Building on the concurrency fundamentals, this chapter explores production-grade patterns used in real-world Go applications. These are not academic exercises — they are the patterns you will find in Kubernetes, Docker, CockroachDB, and virtually every serious Go service.
A worker pool limits the number of concurrent goroutines to prevent resource exhaustion. Think of it like a restaurant kitchen: you have a fixed number of cooks (workers), and orders (jobs) queue up in a ticket rail (channel). Adding more cooks beyond the kitchen’s capacity does not make things faster — it causes collisions and chaos. The worker pool pattern gives you control over that concurrency level.
package mainimport ( "fmt" "sync" "time")type Job struct { ID int Payload string}type Result struct { JobID int Output string Error error}func worker(id int, jobs <-chan Job, results chan<- Result, wg *sync.WaitGroup) { defer wg.Done() for job := range jobs { // Simulate work time.Sleep(100 * time.Millisecond) results <- Result{ JobID: job.ID, Output: fmt.Sprintf("Worker %d processed: %s", id, job.Payload), } }}func main() { const numWorkers = 5 const numJobs = 20 jobs := make(chan Job, numJobs) results := make(chan Result, numJobs) var wg sync.WaitGroup // Start workers for i := 1; i <= numWorkers; i++ { wg.Add(1) go worker(i, jobs, results, &wg) } // Send jobs for i := 1; i <= numJobs; i++ { jobs <- Job{ID: i, Payload: fmt.Sprintf("task-%d", i)} } close(jobs) // Wait for workers and close results go func() { wg.Wait() close(results) }() // Collect results for result := range results { fmt.Printf("Job %d: %s\n", result.JobID, result.Output) }}
Pipelines are a series of stages connected by channels, where each stage is a group of goroutines running the same function. Think of it like an assembly line in a factory: raw materials enter stage 1, get processed, move to stage 2, get transformed further, and emerge as finished products. Each stage runs independently and communicates only through the channel connecting it to the next stage.
// Stage 1: Generate numbersfunc generate(nums ...int) <-chan int { out := make(chan int) go func() { defer close(out) for _, n := range nums { out <- n } }() return out}// Stage 2: Square numbersfunc square(in <-chan int) <-chan int { out := make(chan int) go func() { defer close(out) for n := range in { out <- n * n } }() return out}// Stage 3: Filter even numbersfunc filterEven(in <-chan int) <-chan int { out := make(chan int) go func() { defer close(out) for n := range in { if n%2 == 0 { out <- n } } }() return out}func main() { // Build pipeline: generate -> square -> filter nums := generate(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) squared := square(nums) evens := filterEven(squared) // Consume for n := range evens { fmt.Println(n) // 4, 16, 36, 64, 100 }}
func generateWithContext(ctx context.Context, nums ...int) <-chan int { out := make(chan int) go func() { defer close(out) for _, n := range nums { select { case <-ctx.Done(): return case out <- n: } } }() return out}func squareWithContext(ctx context.Context, in <-chan int) <-chan int { out := make(chan int) go func() { defer close(out) for n := range in { select { case <-ctx.Done(): return case out <- n * n: } } }() return out}
Fan-out: Multiple goroutines read from the same channel until it’s closed.Fan-in: A function reads from multiple inputs and multiplexes onto a single channel.
// Fan-out: distribute work across multiple workersfunc fanOut(in <-chan int, numWorkers int) []<-chan int { outputs := make([]<-chan int, numWorkers) for i := 0; i < numWorkers; i++ { outputs[i] = worker(in) } return outputs}func worker(in <-chan int) <-chan int { out := make(chan int) go func() { defer close(out) for n := range in { // Process and send result out <- n * n } }() return out}// Fan-in: merge multiple channels into onefunc fanIn(channels ...<-chan int) <-chan int { out := make(chan int) var wg sync.WaitGroup // Start a goroutine for each input channel for _, ch := range channels { wg.Add(1) go func(c <-chan int) { defer wg.Done() for n := range c { out <- n } }(ch) } // Close output when all inputs are done go func() { wg.Wait() close(out) }() return out}// Generic fan-infunc FanIn[T any](ctx context.Context, channels ...<-chan T) <-chan T { out := make(chan T) var wg sync.WaitGroup for _, ch := range channels { wg.Add(1) go func(c <-chan T) { defer wg.Done() for { select { case <-ctx.Done(): return case v, ok := <-c: if !ok { return } select { case <-ctx.Done(): return case out <- v: } } } }(ch) } go func() { wg.Wait() close(out) }() return out}
Limit concurrent access to a resource using a buffered channel as a semaphore. This is an elegant Go idiom: a buffered channel of struct{} (empty structs, which consume zero memory) naturally acts as a counting semaphore. Sending to the channel “acquires” a slot, receiving from it “releases” one.
type Semaphore struct { sem chan struct{}}func NewSemaphore(maxConcurrent int) *Semaphore { return &Semaphore{ sem: make(chan struct{}, maxConcurrent), }}func (s *Semaphore) Acquire() { s.sem <- struct{}{}}func (s *Semaphore) Release() { <-s.sem}func (s *Semaphore) TryAcquire() bool { select { case s.sem <- struct{}{}: return true default: return false }}// Usagefunc processWithLimit(items []string, maxConcurrent int) { sem := NewSemaphore(maxConcurrent) var wg sync.WaitGroup for _, item := range items { wg.Add(1) sem.Acquire() go func(item string) { defer wg.Done() defer sem.Release() // Process item process(item) }(item) } wg.Wait()}
var cache sync.Mapfunc getOrCompute(key string, compute func() interface{}) interface{} { if val, ok := cache.Load(key); ok { return val } val := compute() actual, _ := cache.LoadOrStore(key, val) return actual}
You need to process 1 million items from a queue with at most 100 concurrent workers. Design the solution, explain your channel sizing decisions, and tell me what can go wrong.
Strong Answer:
The core pattern is a worker pool: a buffered jobs channel, N worker goroutines reading from it, and a results channel for output. I would create jobs := make(chan Job, 200) (2x worker count for pipeline smoothness), spawn 100 workers, and have the main goroutine feed jobs into the channel. Each worker runs for job := range jobs { result := process(job); results <- result }. A separate goroutine collects from the results channel.
Channel sizing: the jobs channel buffer should be large enough that the producer does not block on every send (which would serialize the work), but small enough that you do not buffer millions of items in memory. 2x to 5x the worker count is a practical sweet spot. The results channel should be similarly sized.
What can go wrong: First, if workers panic, they die silently and the pool shrinks. Wrap each worker’s body in a recover to catch panics, log them, and continue processing. Second, if the results consumer is slower than the producers, the results channel fills up and workers block, creating back-pressure that stalls the whole pool. Monitor channel lengths in production. Third, if you forget to close the jobs channel after sending all items, workers block forever waiting for more jobs — a goroutine leak. Fourth, error handling: if one job fails, do you skip it, retry it, or stop everything? Use errgroup.WithContext to cancel all workers on the first error if fail-fast is desired.
For 1 million items, memory is a concern. Do not load all items into memory at once. Feed them into the jobs channel on demand (streaming from the source), so memory usage is bounded by the channel buffer size times the item size.
Follow-up: Compare errgroup with sync.WaitGroup. When would you choose one over the other?sync.WaitGroup is for “wait until all goroutines finish” — it has no error handling. You call Add(1) before launching, Done() when finished, and Wait() to block. If any goroutine fails, you need separate error collection logic. errgroup.Group adds error propagation: each goroutine function returns an error, and g.Wait() returns the first non-nil error. Combined with errgroup.WithContext, it also propagates cancellation — when one goroutine errors, the context is cancelled, signaling other goroutines to stop. Choose WaitGroup for fire-and-forget goroutines where errors are handled internally (logging, retry). Choose errgroup when errors should propagate to the caller, when you want automatic cancellation on failure, or when you need to limit concurrency with g.SetLimit(n). In practice, I default to errgroup for production code because I almost always want error propagation.
Explain the fan-out/fan-in pattern. Walk me through a production scenario where you used it, and describe how you handled backpressure.
Strong Answer:
Fan-out means distributing work from one channel to multiple worker goroutines. Fan-in means merging results from multiple channels into a single channel. Together, they allow parallel processing of a stream of work.
Production scenario: an image processing pipeline. Stage 1 reads image URLs from a queue (single producer). Fan-out: 10 goroutines each download images concurrently. Fan-in: their results merge into a single channel. Stage 2: 5 goroutines resize images. Fan-in again. Stage 3: a single goroutine uploads results to S3.
For the fan-in, each input goroutine sends to the shared output channel, and a coordinator goroutine uses a WaitGroup to know when all inputs are done, then closes the output channel. The pattern is: for each input channel, launch a goroutine that reads from it and sends to the output; when the input channel closes, call wg.Done(). A separate goroutine does wg.Wait(); close(output).
Backpressure handling: if stage 2 is slower than stage 1, the fan-in channel between them fills up. Buffered channels provide a small shock absorber, but if the rate mismatch is sustained, the fast stage blocks on channel sends — this IS the backpressure mechanism. The producer slows down naturally because it cannot send faster than the consumer can receive. This is healthy — it prevents memory exhaustion. If you do not want the producer to block, use a separate approach: drop items (lossy), write overflow to disk (spill-to-disk), or dynamically scale the number of workers based on channel length.
Critical detail: every stage must check ctx.Done() in its select statement. If any stage fails or the request is cancelled, cancellation propagates through the entire pipeline via context, and all goroutines exit cleanly.
Follow-up: What is sync.Once and when would you use it instead of init()?sync.Once guarantees a function executes exactly once, even when called from multiple goroutines concurrently. The first caller executes the function, and all other callers block until it completes, then return immediately. Use it for lazy initialization: var dbOnce sync.Once; func getDB() *sql.DB { dbOnce.Do(func() { db = openDB() }); return db }. This is better than init() because: the initialization happens on first use, not at import time; you can handle errors (init cannot return errors); and the initialization is explicit rather than implicit. Use init() only for truly package-level, side-effect registration (like database drivers). Use sync.Once for everything else that needs one-time initialization, especially when the initialization involves resources that might not be needed.
Your service uses a circuit breaker for calls to a payment provider. Explain the three states, how transitions work, and a production scenario where the circuit breaker saved you.
Strong Answer:
The three states are Closed (normal operation — requests pass through), Open (circuit tripped — requests fail immediately without calling the downstream service), and Half-Open (recovery probe — a limited number of requests are allowed through to test if the downstream service has recovered).
Transitions: Closed to Open happens when the failure rate exceeds a threshold (e.g., 60% of requests in the last 10 seconds failed). Open to Half-Open happens after a timeout period (e.g., 30 seconds). Half-Open to Closed happens when a configurable number of consecutive successes occur. Half-Open to Open happens when any request in the probe phase fails.
Production scenario: a payment provider experienced a partial outage where 80% of requests timed out after 10 seconds. Without a circuit breaker, our service would queue up thousands of goroutines each waiting 10 seconds for a timeout, exhausting our connection pool and goroutine capacity, making our entire service unresponsive — even for endpoints that do not use the payment provider. With the circuit breaker, after 3 failures, it opened and started returning errors immediately (in microseconds instead of 10 seconds). Users saw a “payment temporarily unavailable” message instantly instead of a 10-second hang. After 30 seconds, the half-open state probed the provider, detected it was back, and closed the circuit. Total impact: 30 seconds of degraded payment functionality instead of a cascading failure that would have taken down the entire service.
Implementation note: in Go, I use the sony/gobreaker library rather than rolling my own. The key configuration parameters are: failure threshold (how many failures to trip), timeout (how long to stay open), and the ready-to-trip function (which can use failure ratio, absolute counts, or latency percentiles).
Follow-up: How do you prevent the circuit breaker from oscillating between open and closed under sustained partial failure?The half-open state with a success threshold prevents oscillation. Instead of closing the circuit on a single success, you require N consecutive successes (typically 3-5) in the half-open state before transitioning to closed. If any half-open request fails, the circuit goes back to open with a potentially longer timeout (exponential backoff on the open-to-half-open timeout). Some implementations also use a sliding window in the closed state: instead of tripping on N absolute failures, they trip when the failure RATE exceeds a percentage over a time window. This prevents a single transient error from opening the circuit while still detecting sustained issues. The gobreaker library supports all of these through its Settings.ReadyToTrip callback where you can implement custom logic based on the failure ratio over the observation window.