Golang Notes (II): Context Source Code Analysis

[Figure: go-context-tree-construction.png]

Overview

Context is a somewhat unique yet commonly used concept in Go. Used well, it achieves a lot with little effort. Used without an understanding of its internals, it becomes complexity for its own sake: at best it hurts code structure, at worst it buries numerous bugs.

Golang constructs Contexts using a tree-like derivation approach, passing deadline and cancel signals across different goroutines [1] to manage the lifecycle of a group of goroutines involved in processing a task, preventing goroutine leaks. It also allows passing/sharing data across an entire request via Values attached to the Context.

Context is most often used to track the lifecycle of long-running, cross-process IO requests such as RPC/HTTP, allowing the outer caller to cancel the request actively or automatically (on timeout), thereby telling the child goroutines to exit and release the resources they hold.

Context is essentially a mechanism for propagating signals during tree-like nested API calls. This article will analyze Context from several aspects: interface, derivation, source code analysis, and usage.

Author: Muniao’s Miscellany https://www.qtmuniao.com/2020/07/12/go-context/, please cite the source when reposting

The Context Interface

The Context interface is defined as follows:

// Context is used to pass deadlines, cancellation signals, and request-scoped key-value pairs across API boundaries.
// The methods of Context are safe for concurrent use by multiple goroutines.
type Context interface {
	// Done returns a read-only channel that is closed when the Context is canceled or times out.
	Done() <-chan struct{}

	// Err returns the error indicating why the Context ended.
	Err() error

	// If a deadline is set on the Context, Deadline returns the deadline time.
	Deadline() (deadline time.Time, ok bool)

	// Value returns the value associated with the given key, or nil.
	Value(key interface{}) interface{}
}

The above are abbreviated comments; for detailed interface information, see the Context godoc.

  • The Done() method returns a read-only channel. When a Context is actively canceled or times out, the done channel of this Context and of all its derived Contexts is closed. After receiving the close signal via this channel, all child goroutines should immediately interrupt execution, release resources, and return.

  • Err() returns nil before the above channel is closed, and after the channel is closed, it returns information about why the Context was closed. The error type has only two values, canceled or deadline exceeded:

    var Canceled = errors.New("context canceled")
    var DeadlineExceeded error = deadlineExceededError{}

  • Deadline() returns ok=true and the corresponding expiration time if this Context has a deadline set. Otherwise, it returns ok=false and the zero time.Time value.

  • Value() returns the value bound to the given Key on the Context chain (I call it the lookup chain, explained below). If not found, it returns nil. Note: do not use it to pass parameters between functions; its original intent is to share values that span the entire Context lifecycle. The Key can be any comparable type. To prevent key collisions, it’s best to define the Key type as an unexported type and define accessors for it. Here’s an example of sharing user information via Context:

package user

import "context"

// User is the type of value stored in Contexts.
type User struct{...}

// key is defined as an unexported type to avoid conflicts with keys defined in other packages.
type key int

// userKey is the key for user.User values in Contexts. It is unexported;
// clients use user.NewContext and user.FromContext instead of using this key directly.
var userKey key

// NewContext returns a new Context that carries value u.
func NewContext(ctx context.Context, u *User) context.Context {
	return context.WithValue(ctx, userKey, u)
}

// FromContext returns the User value stored in ctx, if any.
func FromContext(ctx context.Context) (*User, bool) {
	u, ok := ctx.Value(userKey).(*User)
	return u, ok
}
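
To see why the key type matters, here is a minimal sketch. It assumes two hypothetical key types, userKeyType and traceKeyType, which in real code would each be unexported in its own package. Both keys have the underlying value 0, yet they do not collide, because keys are compared as interface values and the dynamic type takes part in the comparison.

```go
package main

import (
	"context"
	"fmt"
)

// userKeyType and traceKeyType are hypothetical key types for this sketch.
// In real code each would be unexported in its own package.
type userKeyType int
type traceKeyType int

const (
	userKey  userKeyType  = 0
	traceKey traceKeyType = 0
)

// lookup stores one value under each key and reads both back.
func lookup() (user, trace string) {
	ctx := context.WithValue(context.Background(), userKey, "alice")
	ctx = context.WithValue(ctx, traceKey, "trace-42")

	// Both keys are 0, but their types differ, so each finds its own value.
	user, _ = ctx.Value(userKey).(string)
	trace, _ = ctx.Value(traceKey).(string)
	return user, trace
}

func main() {
	user, trace := lookup()
	fmt.Println(user, trace)

	// A plain int 0 has yet another dynamic type and matches neither key.
	ctx := context.WithValue(context.Background(), userKey, "alice")
	fmt.Println(ctx.Value(0) == nil)
}
```

Because a package outside cannot name an unexported key type, it cannot construct an equal key, which is what makes the accessor functions the only way in.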

Deriving Contexts

The beauty of Context design lies in its ability to derive tree-like structures from an existing Context to manage the lifecycle of a group of goroutines. A single Context instance is immutable, but a new Context can be derived from it with additional properties (cancelable, deadline, key-value) using functions provided by the context package, such as WithCancel, WithTimeout, and WithValue, to construct a tree-organized set of Contexts.

[Figure: go-context-tree.png]

When the root Context ends, all Contexts derived from it are also canceled. That is, the parent Context’s lifecycle encompasses all child Contexts’ lifecycles.
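
The one-directional nature of this relationship can be checked with a small sketch. The helper names isDone and cascade are made up for illustration: canceling one child leaves the root and its sibling running, while canceling the root ends everything derived from it.

```go
package main

import (
	"context"
	"fmt"
)

// isDone reports whether ctx has already ended, without blocking.
func isDone(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

// cascade cancels one child first, then the root, and records which of
// (root, childA, childB) are done after each step.
func cascade() (afterChild, afterRoot [3]bool) {
	root, cancelRoot := context.WithCancel(context.Background())
	defer cancelRoot()
	childA, cancelA := context.WithCancel(root)
	defer cancelA()
	childB, cancelB := context.WithCancel(root)
	defer cancelB()

	cancelA() // affects only childA
	afterChild = [3]bool{isDone(root), isDone(childA), isDone(childB)}

	cancelRoot() // propagates to every derived Context
	afterRoot = [3]bool{isDone(root), isDone(childA), isDone(childB)}
	return afterChild, afterRoot
}

func main() {
	afterChild, afterRoot := cascade()
	fmt.Println(afterChild) // [false true false]
	fmt.Println(afterRoot)  // [true true true]
}
```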

context.Background() is typically used as the root node; it never times out and cannot be canceled.

// Background returns an empty Context. It is never canceled, has no deadline, and has no values.
// Background is typically used in main, init, and tests as the root Context of a long-running process.
func Background() Context

WithCancel and WithTimeout can derive new Contexts from a parent Context, returning new Contexts constrained by the parent Context’s lifecycle.

A Context derived from context.Background() via WithCancel should be canceled promptly after the corresponding process completes, otherwise it will cause a Context leak.

Using WithTimeout can control the processing deadline of a process. Specifically, when the deadline is reached, the Context closes its Done Channel [2], and a child goroutine that detects the closed channel should exit immediately.

// WithCancel returns a copy of parent with a new Done channel and a cancel function.
// The returned Context's Done channel is closed when the parent's Done channel is closed
// or when the returned cancel function is called, whichever happens first.
func WithCancel(parent Context) (ctx Context, cancel CancelFunc)

// A CancelFunc tells an operation to abandon its work.
type CancelFunc func()

// WithTimeout returns a copy of parent with a new Done channel and a cancel function.
// The returned Context's Done channel is closed when the parent's Done channel is closed,
// the returned cancel function is called, or the timeout expires, whichever happens first.
// If the internal timer is still running when cancel is called, it will be stopped.
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)
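
As a usage sketch of these signatures, the following assumes a hypothetical task, slowOperation, that needs a fixed amount of time but gives up as soon as its Context ends. The deferred cancel is the part that prevents the leak described above: it releases the timer and detaches the Context from its parent even when the task finishes early.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// slowOperation is a hypothetical task that needs `work` to finish but
// returns early with ctx.Err() once ctx is done.
func slowOperation(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWithTimeout runs slowOperation under the given timeout.
func runWithTimeout(timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // always release resources, even on the success path
	return slowOperation(ctx, work)
}

func main() {
	fmt.Println(runWithTimeout(20*time.Millisecond, time.Second)) // context deadline exceeded
	fmt.Println(runWithTimeout(time.Second, 5*time.Millisecond))  // <nil>
}
```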

WithValue can attach key-value pairs to the Context for the entire processing lifecycle.

// WithValue returns a copy of parent in which the value associated with key is val.
func WithValue(parent Context, key interface{}, val interface{}) Context

Context Source Code Analysis

The Go context package uses embedding, which plays a role similar to inheritance, to organize several Context types: emptyCtx, valueCtx, cancelCtx, and timerCtx.

[Figure: go-context-implementation.png]

Put figuratively, through embedding, Go constructs a “pointer” from each Context node in the tree to its parent instance. From another perspective, this is a classic code organization pattern, the Composite Pattern: each layer adds or overrides only the functionality it cares about and reuses the existing implementations by delegating the remaining calls to the embedded Context.

emptyCtx

emptyCtx implements an empty Context; all interface methods are no-ops.

type emptyCtx int

func (*emptyCtx) Deadline() (deadline time.Time, ok bool) {
	return
}

func (*emptyCtx) Done() <-chan struct{} {
	return nil // Returning nil is syntactically a no-op, semantically meaning this Context will never be closed.
}

// ... Others omitted, similar empty function bodies that satisfy the interface.

func (e *emptyCtx) String() string {
	switch e {
	case background:
		return "context.Background"
	case todo:
		return "context.TODO"
	}
	return "unknown empty Context"
}

Both context.Background() and context.TODO() return instances of emptyCtx, though their semantics differ slightly. The former serves as the root node of the Context tree, while the latter is typically used when unsure what to use.

var (
	background = new(emptyCtx)
	todo       = new(emptyCtx)
)

func Background() Context {
	return background
}

func TODO() Context {
	return todo
}
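
The no-op behavior is easy to observe from outside the package. The sketch below uses only the public API; describe and describeRoots are helper names made up for this illustration. Note that Done() returns a nil channel, and a receive from a nil channel blocks forever, so a select on it never fires.

```go
package main

import (
	"context"
	"fmt"
)

// describe summarizes what the interface methods return for ctx.
func describe(ctx context.Context) string {
	_, hasDeadline := ctx.Deadline()
	return fmt.Sprintf("%v done==nil:%t err:%v hasDeadline:%t",
		ctx, ctx.Done() == nil, ctx.Err(), hasDeadline)
}

// describeRoots describes the two package-level empty Contexts.
func describeRoots() (background, todo string) {
	return describe(context.Background()), describe(context.TODO())
}

func main() {
	background, todo := describeRoots()
	fmt.Println(background)
	fmt.Println(todo)
}
```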

valueCtx

valueCtx embeds a Context interface for Context derivation and attaches a single key-value pair. As can be seen from the context.WithValue function, each attached key-value pair wraps a new valueCtx. When accessing a Key via the Value(key interface{}) interface, it traverses upward along the Context tree lookup chain through all Contexts until reaching emptyCtx:

  1. If it encounters a valueCtx instance, it compares its key with the given key for equality.
  2. If it encounters another Context instance, it delegates upward directly. But there is a special case: to obtain the nearest cancelCtx among all ancestor nodes of a given Context, Go uses a special key: cancelCtxKey; when encountering this key, cancelCtx returns itself. This will be mentioned in the cancelCtx implementation.

For other interface calls (Done, Err, Deadline), they are routed to the embedded Context.

type valueCtx struct {
	Context // embedding, pointing to the parent Context
	key, val interface{}
}

func (c *valueCtx) Value(key interface{}) interface{} {
	if c.key == key {
		return c.val
	}
	return c.Context.Value(key)
}

func WithValue(parent Context, key, val interface{}) Context {
	if key == nil {
		panic("nil key")
	}
	if !reflectlite.TypeOf(key).Comparable() {
		panic("key is not comparable")
	}
	return &valueCtx{parent, key, val} // attach kv and reference parent Context
}
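
One consequence of this upward lookup is shadowing: binding the same key again in a derived Context hides the ancestor's value for that subtree only. A minimal sketch, where ctxKey and shadow are names made up for the example:

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is a hypothetical key type for this sketch.
type ctxKey string

// shadow derives two Contexts that bind the same key and reports what each
// level of the chain sees, plus the result for a key that was never bound.
func shadow() (fromParent, fromChild, missing interface{}) {
	const k ctxKey = "lang"
	parent := context.WithValue(context.Background(), k, "go")
	child := context.WithValue(parent, k, "rust")

	// Lookup starts at the node it is called on and walks toward the root,
	// so the child sees its own binding and the parent is left untouched.
	return parent.Value(k), child.Value(k), child.Value(ctxKey("other"))
}

func main() {
	fmt.Println(shadow()) // go rust <nil>
}
```

Since every WithValue call adds one node, a lookup costs time proportional to the depth of the chain, which is one more reason to keep Context values few.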

cancelCtx

The core implementation of the context package lies in cancelCtx, including constructing the tree structure and performing cascading cancellation.

type cancelCtx struct {
	Context

	mu       sync.Mutex            // protects the following three fields
	done     chan struct{}         // lazily initialized, closed by the first cancel() call
	children map[canceler]struct{} // set to nil by the first cancel() call
	err      error                 // set to non-nil by the first cancel() call
}

func (c *cancelCtx) Value(key interface{}) interface{} {
	if key == &cancelCtxKey {
		return c
	}
	return c.Context.Value(key)
}

func (c *cancelCtx) Done() <-chan struct{} {
	c.mu.Lock()
	if c.done == nil {
		c.done = make(chan struct{})
	}
	d := c.done
	c.mu.Unlock()
	return d
}

The Value() function implementation is interesting: when encountering the special key cancelCtxKey, it returns itself. This actually reuses the lookup logic of the Value function, so that when traversing the Context tree lookup chain, the first ancestor cancelCtx instance of the given Context can be found.

children stores the cancelable Contexts (those implementing the canceler interface, i.e. cancelCtx or timerCtx nodes) that are the first ones reached when walking down each path of the subtree rooted at this node. The figure in the Tree Construction section gives an intuitive picture.

The following will explain each in detail.

Lookup Chain

The lookup chain is constructed by the context package using Go’s embedding feature, and is mainly used for:

  1. Looking up matching key-value pairs upward along the chain when Value() is called.
  2. Reusing the Value() logic to find the nearest cancelCtx ancestor, in order to construct the Context tree.

Among valueCtx, cancelCtx, and timerCtx, only cancelCtx directly implements a non-empty Done() method (valueCtx and timerCtx both delegate via embedding, and calling this method will directly forward to cancelCtx or emptyCtx). Therefore, done := parent.Done() will return the done channel from the first ancestor cancelCtx. However, if there is a third-party implementation of the Context interface in the Context tree, parent.Done() might return another channel.

Thus, if p.done != done, it indicates that the first Context implementing a non-empty Done() encountered in the lookup chain is a third-party Context, not cancelCtx.

// parentCancelCtx returns the first ancestor cancelCtx node of parent.
func parentCancelCtx(parent Context) (*cancelCtx, bool) {
	done := parent.Done() // Call the first instance implementing Done() in the lookup chain (third-party Context type / cancelCtx)
	if done == closedchan || done == nil {
		return nil, false
	}
	p, ok := parent.Value(&cancelCtxKey).(*cancelCtx) // First cancelCtx instance in the lookup chain
	if !ok {
		return nil, false
	}
	p.mu.Lock()
	ok = p.done == done
	p.mu.Unlock()
	if !ok { // Indicates the first instance implementing Done() in the lookup chain is not a cancelCtx instance
		return nil, false
	}
	return p, true
}

Tree Construction

Context tree construction is performed via propagateCancel when context.WithCancel() is called.

func WithCancel(parent Context) (ctx Context, cancel CancelFunc) {
	c := newCancelCtx(parent)
	propagateCancel(parent, &c)
	return &c, func() { c.cancel(true, Canceled) }
}

The Context tree can essentially be refined into a canceler (*cancelCtx and *timerCtx) tree, because during cascading cancellation we only need to find all cancelers in the subtree. Therefore, the implementation only needs to save all canceler relationships in the tree (skipping valueCtx), which is simple and efficient.

// A canceler is a context type that can be canceled directly. The
// implementations are *cancelCtx and *timerCtx.
type canceler interface {
	cancel(removeFromParent bool, err error)
	Done() <-chan struct{}
}

The specific implementation is: along the lookup chain, find the first instance that implements the Done() method.

  1. If it is a cancelCtx instance, it has a children field and implements the cancel method (canceler); just put the new Context into its children map. After that, when the parent cancelCtx is canceled, it traverses all children and cancels them one by one, recursively.
  2. If it is a third-party Context instance, its internal implementation is unknown, so the only option is to start a guardian goroutine for each newly added child Context; when the parent Context is canceled, the goroutine cancels the child.

Note that because a Context may be accessed concurrently by multiple goroutines, the code must check again, while holding the lock and before modifying the parent's fields, whether the parent has already been canceled. If it has, the child Context is canceled immediately and the function returns.

func propagateCancel(parent Context, child canceler) {
	done := parent.Done()
	if done == nil {
		return // Parent is not cancelable
	}

	select {
	case <-done:
		// Parent already canceled
		child.cancel(false, parent.Err())
		return
	default:
	}

	if p, ok := parentCancelCtx(parent); ok { // Found a cancelCtx instance
		p.mu.Lock()
		if p.err != nil {
			// Parent already canceled
			child.cancel(false, p.err)
		} else {
			if p.children == nil {
				p.children = make(map[canceler]struct{}) // Lazily created
			}
			p.children[child] = struct{}{}
		}
		p.mu.Unlock()
	} else { // Found a non-cancelCtx instance
		atomic.AddInt32(&goroutines, +1)
		go func() {
			select {
			case <-parent.Done():
				child.cancel(false, parent.Err())
			case <-child.Done():
			}
		}()
	}
}

The figure below explains the lookup chain and tree organization: C0 is an emptyCtx, usually obtained from context.Background(), serving as the root node of the Context tree. C1~C4 are successively derived from their respective parent nodes via embedding. The dashed lines are the lookup chain formed by embedding, and the solid lines are the parent-child relationships saved in the cancelCtx children map.

parentCancelCtx(C2) and parentCancelCtx(C4) both return C1, so C1’s children map stores C2 and C4. After building these two relationships, we can query Value upward along the lookup chain, including finding the first ancestor cancelCtx; we can also cancel downward along the children relationships.

[Figure: go-context-tree-construction.png]

Of course, all Contexts in the figure are built-in Contexts from the context package; third-party Contexts are not depicted. The actual code is slightly harder to understand because it adds handling logic for third-party Contexts. What matters is whether the first ancestor that supplies the Done channel is a built-in cancelCtx, which is exactly what parentCancelCtx checks. Because the canceler interface has an unexported method, types outside the context package cannot implement it on their own.

If such a cancelCtx is found, the new child is registered in its children map, and when that cancelCtx is canceled it recursively calls the children’s cancel for cascading cancellation. Otherwise, a goroutine has to be started for each Context derived from a third-party Context, to listen for the upstream cancellation event and cancel the derived Context.
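
The third-party path can be exercised with a small sketch. signalCtx below is a hypothetical Context implementation, not part of any library: it embeds a parent for Deadline and Value but supplies its own Done channel and Err. Deriving a standard cancelable Context from it still works, because the context package falls back to watching parent.Done().

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// signalCtx is a hypothetical third-party Context with its own Done channel.
type signalCtx struct {
	context.Context
	stop chan struct{}
}

var errStopped = errors.New("signalCtx stopped")

func (s *signalCtx) Done() <-chan struct{} { return s.stop }

// Err must be non-nil once Done is closed, as the Context contract requires.
func (s *signalCtx) Err() error {
	select {
	case <-s.stop:
		return errStopped
	default:
		return nil
	}
}

// deriveFromThirdParty derives a standard cancelable Context from a
// signalCtx, stops the signalCtx, and returns the error the child ends with.
func deriveFromThirdParty() error {
	parent := &signalCtx{Context: context.Background(), stop: make(chan struct{})}
	child, cancel := context.WithCancel(parent)
	defer cancel()

	close(parent.stop) // invisible to the context package except via Done()

	select {
	case <-child.Done():
		return child.Err()
	case <-time.After(time.Second):
		return errors.New("child was not canceled")
	}
}

func main() {
	fmt.Println(deriveFromThirdParty()) // signalCtx stopped
}
```

The child reports the parent's own error value, since the watcher passes parent.Err() through as the cancellation reason.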

Cascading Cancellation

Below is the implementation of the key function cancelCtx.cancel in cascading cancellation. When this cancelCtx is canceled, it needs to cascade-cancel all Contexts in the Context tree rooted at this cancelCtx, and remove the root cancelCtx from its parent node, so that the GC can reclaim resources of all nodes in the cancelCtx subtree.

cancelCtx.cancel is an unexported method and cannot be called outside the context package, so inner goroutines holding the Context cannot cancel it themselves; cancellation must be done via the returned CancelFunc (a simple wrapper around cancelCtx.cancel), whose handle is generally held by the outer goroutine.

func (c *cancelCtx) cancel(removeFromParent bool, err error) {
	if err == nil { // A cancellation reason must be provided: Canceled or DeadlineExceeded
		panic("context: internal error: missing cancel error")
	}

	c.mu.Lock()
	if c.err != nil {
		c.mu.Unlock()
		return // Already canceled by another goroutine
	}

	// Record the error and close done
	c.err = err
	if c.done == nil {
		c.done = closedchan
	} else {
		close(c.done)
	}

	// Cascading cancellation
	for child := range c.children {
		// NOTE: holding the parent Context's lock while acquiring the child Context's lock
		child.cancel(false, err)
	}
	c.children = nil
	c.mu.Unlock()

	// Only the subtree root needs to be removed; other nodes in the subtree do not
	if removeFromParent {
		removeChild(c.Context, c)
	}
}
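
The early return on c.err != nil has a visible effect for callers: the first cancellation reason sticks, and calling the CancelFunc again is harmless. A minimal sketch using only the public API (firstReasonWins is a name made up for this example):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// firstReasonWins lets a short timeout expire, then calls cancel twice, and
// returns the error observed before and after the extra calls.
func firstReasonWins() (before, after error) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	<-ctx.Done() // the timer fires and records DeadlineExceeded
	before = ctx.Err()

	cancel() // no-op: the error is already set
	cancel() // calling a CancelFunc repeatedly is safe
	after = ctx.Err()
	return before, after
}

func main() {
	before, after := firstReasonWins()
	fmt.Println(before) // context deadline exceeded
	fmt.Println(after)  // context deadline exceeded
}
```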

timerCtx

timerCtx adds a timer on top of the embedded cancelCtx, canceling automatically when the user-set deadline is reached.

type timerCtx struct {
	cancelCtx
	timer *time.Timer // Under cancelCtx.mu

	deadline time.Time
}

func (c *timerCtx) Deadline() (deadline time.Time, ok bool) {
	return c.deadline, true
}

func (c *timerCtx) cancel(removeFromParent bool, err error) {
	// Cascade-cancel all Contexts in the subtree
	c.cancelCtx.cancel(false, err)

	if removeFromParent {
		// Call separately to remove this node, because we are removing c, not c.cancelCtx
		removeChild(c.cancelCtx.Context, c)
	}

	// Stop the timer
	c.mu.Lock()
	if c.timer != nil {
		c.timer.Stop()
		c.timer = nil
	}
	c.mu.Unlock()
}

Setting timeout cancellation is done in context.WithDeadline(). If an ancestor node’s deadline is earlier than this node’s, a cancelCtx is sufficient, because the ancestor will cascade-cancel it when its deadline is reached.

func WithDeadline(parent Context, d time.Time) (Context, CancelFunc) {
	if cur, ok := parent.Deadline(); ok && cur.Before(d) {
		// Ancestor node's deadline is earlier
		return WithCancel(parent)
	}

	c := &timerCtx{
		cancelCtx: newCancelCtx(parent), // Use a new cancelCtx to implement partial cancel functionality
		deadline:  d,
	}
	propagateCancel(parent, c) // Build the Context cancellation tree, note c is passed, not c.cancelCtx
	dur := time.Until(d)       // Check if the deadline is so close that it has already passed
	if dur <= 0 {
		c.cancel(true, DeadlineExceeded)
		return c, func() { c.cancel(false, Canceled) }
	}

	// Set timeout cancellation
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.err == nil {
		c.timer = time.AfterFunc(dur, func() {
			c.cancel(true, DeadlineExceeded)
		})
	}
	return c, func() { c.cancel(true, Canceled) }
}
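
The first branch of WithDeadline can be observed from the outside. In the sketch below (inheritedDeadline is a made-up helper name), a child asks for a one-hour timeout under a parent that expires after 20 milliseconds; the child reports the parent's deadline and ends when the parent's timer fires.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// inheritedDeadline requests a child deadline later than the parent's and
// reports whether the child is bounded by the parent's deadline, along with
// the error the child ends with.
func inheritedDeadline() (sameAsParent bool, childErr error) {
	parent, cancelParent := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancelParent()
	child, cancelChild := context.WithTimeout(parent, time.Hour)
	defer cancelChild()

	pd, _ := parent.Deadline()
	cd, ok := child.Deadline()
	sameAsParent = ok && cd.Equal(pd)

	<-child.Done() // closed by the parent's timer, long before the hour is up
	return sameAsParent, child.Err()
}

func main() {
	same, err := inheritedDeadline()
	fmt.Println(same, err) // true context deadline exceeded
}
```

In other words, a derived Context can only tighten a deadline, never extend it.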

Using Context

Child goroutines using Context must ensure that they exit promptly and release resources when the Context is closed. That is, using Context requires following this principle to guarantee the effect of resource release during cascading cancellation. Therefore, Context is essentially a tree-like signal distribution mechanism; the Context tree can be used to track the process call tree, and when the outer process is canceled, Context cascades to notify all called processes.

The following is a typical code snippet for a child goroutine checking the Context to determine whether it needs to exit:

// ctx is the context.Context passed in by the caller.
for ; ; time.Sleep(time.Second) {
	select {
	case <-ctx.Done():
		return
	default:
	}

	// Some long-running operations
}

As can be seen, the Context interface itself does not have a Cancel method, which is consistent with the fact that the channel returned by Done() is read-only: the sender and receiver of the Context close signal are usually not in the same function. For example, when the parent goroutine starts some child goroutines to do work, only the parent goroutine can close the done channel, and the child goroutines detect the channel close signal. That is, the child goroutine cannot cancel the Context passed from the parent goroutine.
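
This division of roles can be put into a complete, runnable sketch. The names worker and runWorker are made up for the example. The parent goroutine keeps the CancelFunc, while the child only receives the Context and watches Done(). A ticker is used in place of a sleep so that the worker reacts to cancellation while it is waiting as well.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker does one unit of hypothetical work per tick until ctx is done, then
// reports how it stopped on the result channel.
func worker(ctx context.Context, result chan<- error) {
	ticker := time.NewTicker(5 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			result <- ctx.Err() // cleanup would go here
			return
		case <-ticker.C:
			// One bounded unit of work per tick.
		}
	}
}

// runWorker starts a worker, lets it run briefly, cancels it from the parent
// goroutine, and waits for it to exit.
func runWorker() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	result := make(chan error, 1)
	go worker(ctx, result)

	time.Sleep(20 * time.Millisecond)
	cancel() // only the holder of the CancelFunc can end the Context
	return <-result
}

func main() {
	fmt.Println(runWorker()) // context canceled
}
```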

Context Best Practices

There are some usage practices to follow when using Context:

  1. Context is usually the first parameter in a function.
  2. Do not store Context in a struct; pass Context explicitly in each function. However, in practice, you can flexibly combine based on the struct’s lifecycle.
  3. Do not use a nil Context, even though it is syntactically allowed. When unsure what value to use, you can use context.TODO().
  4. Context values are meant for sharing data across the request lifecycle, not as a way to pass extra parameters to functions. Passing parameters this way is implicit and can easily cause bugs; if you want to pass extra parameters, declare them explicitly in the function signature.
  5. Context is immutable and therefore safe for concurrent use; the same Context can be passed to and used by multiple goroutines.

Notes

[1] In this article, “process” refers to compute-intensive or IO-intensive long-running functions, or goroutines.

[2] A Context’s Done Channel refers to the channel returned by ctx.Done(). It is the key data structure within the Context, serving as the channel for communication between different processes. When termination is needed, the parent process closes this channel (nothing is ever sent on it), and the child process observes the close, performs cleanup, and exits.

References

  1. go doc context: https://golang.org/pkg/context/
  2. code review comments: https://github.com/golang/go/wiki/CodeReviewComments#contexts
  3. go blog context: https://blog.golang.org/context
  4. go context source code: https://golang.org/src/context/context.go?s=8419:8483#L222
  5. Go Language Design and Implementation: https://draveness.me/golang/docs/part3-runtime/ch06-concurrency/golang-context/
