
You're reading from  Effective Concurrency in Go

Product type: Book
Published in: Apr 2023
Publisher: Packt
ISBN-13: 9781804619070
Edition: 1st Edition
Author (1)
Burak Serdar

Burak Serdar is a software engineer with over 30 years of experience in designing and developing distributed enterprise applications that scale. He's worked for several start-ups and large corporations, including Thomson and Red Hat, as an engineer and technical lead. He's one of the co-founders of Cloud Privacy Labs where he works on semantic interoperability and privacy technologies for centralized and decentralized systems. Burak holds BSc and MSc degrees in electrical and electronics engineering, and an MSc degree in computer science.

Worker Pools and Pipelines

This chapter is about two interrelated concurrency constructs: worker pools and pipelines. While a worker pool deals with splitting work among multiple instances of the same computation, a data pipeline deals with splitting work into a sequence of different computations, one after the other.

In this chapter, you will see several working examples of worker pools and data pipelines. These patterns naturally come up as solutions to many problems, and there is no single best solution. I try to separate the concurrency concerns from the computation logic. If you can do the same for your problems, you can iteratively find the best solution for your use case.

The topics that this chapter will cover are as follows:

  • Worker pools, using a file scanner example
  • Data pipelines, using a CSV file processor example

Technical Requirements

The source code for this particular chapter is available on GitHub at https://github.com/PacktPublishing/Effective-Concurrency-in-Go/tree/main/chapter5.

Worker pools

Many concurrent Go programs are combinations of variations on worker pools. One reason may be that channels provide a convenient mechanism for assigning tasks to waiting goroutines. A worker pool is simply a group of one or more goroutines that perform the same task on multiple inputs. There are several reasons why a worker pool may be more practical than creating goroutines as needed. One is that initializing a worker can be expensive (creating the goroutine itself is cheap, but setting up the state a worker needs may not be), so a fixed number of workers can be created once and then reused. Another is that the number of tasks is potentially unbounded, so instead of starting a goroutine for each one, you create a reasonable number of workers once and share the work among them. Regardless of the situation, once you decide you need a worker pool, there are easy-to-repeat patterns that you can use over and over to create high-performing worker pools.
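As a rough illustration of the idea, here is a minimal sketch of a fixed-size worker pool. It is not the book's file scanner example; the names runPool, numWorkers, and work are hypothetical, and the work function (string length) is only a stand-in for a real task.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool starts numWorkers goroutines that all apply the same work function
// to inputs received from a shared channel, and collects the results.
// All names here are hypothetical and chosen for this sketch.
func runPool(numWorkers int, inputs []string, work func(string) int) []int {
	jobs := make(chan string)
	results := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Any expensive per-worker initialization would happen here, once.
			for job := range jobs {
				results <- work(job)
			}
		}()
	}

	// Feed the jobs, then close the channel so the workers terminate.
	go func() {
		for _, in := range inputs {
			jobs <- in
		}
		close(jobs)
	}()

	// Close results once all workers are done so the loop below ends.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []int
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	lengths := runPool(3, []string{"a", "bb", "ccc", "dddd"}, func(s string) int {
		return len(s)
	})
	sum := 0
	for _, n := range lengths {
		sum += n
	}
	// Result order is not deterministic, so only the count and sum are printed.
	fmt.Println(len(lengths), sum)
}
```

The number of goroutines stays fixed at numWorkers no matter how many inputs arrive, which is the property that makes the pool useful when the amount of work is not known in advance.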

We first saw a...

Pipelines, fan-out, and fan-in

Often, a computation has to go through multiple stages that transform and enrich the result. Typically, there is an initial stage that acquires a sequence of data items. This stage passes those data items one by one to successive stages, where each stage operates on the data, produces a result, and passes it on to the next stage. A good example is an image processing pipeline, where the image is decoded, transformed, filtered, cropped, and encoded into another image. Many data processing applications work with large amounts of data, so a concurrent pipeline can be essential for acceptable performance.
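The stage structure described above, together with fan-out and fan-in, might be sketched as follows. This is an illustrative example rather than the book's code: the stage names generate, square, and fanIn are hypothetical, and squaring integers stands in for a real transformation.

```go
package main

import (
	"fmt"
	"sync"
)

// generate is the initial stage: it emits the data items one by one.
func generate(items ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, it := range items {
			out <- it
		}
	}()
	return out
}

// square is a processing stage: it reads from in, transforms each item,
// and passes the result on to the next stage.
func square(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v * v
		}
	}()
	return out
}

// fanIn merges several channels into one. The merged channel is closed
// after all input channels have been drained.
func fanIn(ins ...<-chan int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for _, in := range ins {
		wg.Add(1)
		go func(c <-chan int) {
			defer wg.Done()
			for v := range c {
				out <- v
			}
		}(in)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	src := generate(1, 2, 3, 4)
	// Fan-out: two instances of the same stage read from the same channel.
	// Fan-in: their outputs are merged back into a single channel.
	merged := fanIn(square(src), square(src))
	sum := 0
	for v := range merged {
		sum += v
	}
	// The merged order is not deterministic, so only the sum is printed.
	fmt.Println(sum)
}
```

Because each stage only knows about its input and output channels, stages can be duplicated or rearranged without touching the computation inside them.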

In this chapter, we will build a simple data processing pipeline that reads records from a comma-separated values (CSV) text file. Each record contains a height and a weight measurement for a person, recorded in inches and pounds. Our pipeline will convert these measurements to centimeters and kilograms, then output them as a stream of JSON objects...

Summary

In this chapter, we studied worker pools and pipelines – two patterns that show up in different shapes and forms in almost every non-trivial project. There are many ways these patterns can be implemented with different runtime behaviors. You should build your systems so that they do not rely on the exact structure of the pipeline or the worker pools. I tried to show some ways to abstract away concurrency concerns from computation logic. These ideas may make your job easier when you need to iterate among different designs.

Next, we will talk about error handling and how error handling can be added to these patterns.

Questions

  1. Can you change the worker implementation so that the submitted work can be canceled by the caller?
  2. Many languages offer frameworks with dynamically sized worker pools. Can you think of a way to implement that in Go? Would that worker pool be more performant than a fixed-sized worker pool that uses the same number of goroutines as the maximum for the dynamically sized one?
  3. Try writing a generic fan-in/fan-out function (without ordering) that takes n input channels and m output channels.