Introduction
When I first learned about closures while studying JavaScript, I didn’t quite grasp the concept. For interviews, I memorized a vague “definition”: a function nested inside another function, with the inner function returned from the outer function, bringing the outer function’s environment along with it. At the time, I tried to understand closures from the literal meaning and some classic examples. The Chinese translation “闭包” (bìbāo, literally “closed package”) doesn’t easily convey the underlying principle, so I always felt a bit fuzzy about it. Recently, due to work requirements, I’ve been using Python and encountered closures again. This time, I came across some novel and interesting materials, which finally helped me connect several literal concepts (first-class functions, binding, scope, etc.) together and gain a deeper understanding of closures.
The reference materials are listed at the end; I highly recommend reading them.
Author: Muniao’s Notes https://www.qtmuniao.com, please indicate the source when reposting
Overview
Some English technical terms in computing, when translated literally, inevitably appear pale and awkward due to the lack of context. It takes much groundwork before one can appreciate the principles beneath the quirky surface. Closure is one such concept that involves a great deal of context, including the most fundamental concepts in programming languages: binding, environments, variable scope, and functions as first-class citizens.
Binding
Binding is the most basic abstraction technique in a programming language: it ties a value to a name, which can later be referenced or rebound. Below are several kinds of binding in Python; each group of statements binds a name to its corresponding value in the environment where the statement runs.
The simplest form is an assignment statement, which binds a name to an object in memory. When a function is called, matching formal parameters to actual arguments is also a form of binding:
In [1]: square = 4
Binding a name to a composite operation, i.e., function definition, using the def keyword:
In [1]: def square(x):
Binding a name to a data collection, i.e., class definition, using class:
In [1]: class square:
According to the execution order, multiple bindings with the same name will cause the later one to override the earlier one:
In [1]: square = 3
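The snippets in this section appear cut short, so here is a minimal sketch of the same kinds of binding in one place. The function body, the class body, and the names `first`, `second`, `third`, and `side` are illustrative assumptions, not the original examples.

```python
# Assignment: bind the name `square` to an integer object.
square = 4
first = square                 # 4

# def: rebind the same name to a function (a composite operation).
def square(x):
    # Calling square(4) binds the formal parameter x to the argument 4.
    return x * x

second = square(4)             # 16

# class: rebind the name once more, this time to a class.
class square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side

third = square(3).area()       # 9

# Only the latest binding is visible under the name `square`.
print(first, second, third)    # 4 16 9
print(type(square).__name__)   # type
```

Each later statement overrides the earlier binding of `square`, which is why the values had to be saved under other names before the next rebinding.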
All of these are called abstractions because they provide encapsulation of data, composite operations, or data collections—tying a name to complex data or logic so that the caller doesn’t need to care about the implementation details, and using them as basic building blocks for more complex projects. It can be said that binding is the cornerstone of programming.
Returning to the topic of this article, a closure is first and foremost a function, just a special kind of function. As for what makes it special, I’ll explain after introducing a few more concepts.
Scope
Scope, as the name suggests, is the range that a binding can cover, or how far you can access a variable. Each function definition constructs a local scope.
Python, like most programming languages, uses static scoping rules (also known as lexical scoping). When functions are nested, the inner function can access variables from the outer function. Therefore, you can imagine scope as a container that can be nested, with inner scopes extending outer scopes, and the outermost scope being the global scope.
In the previous section, I mentioned that multiple bindings with the same name will cause the later one to override the earlier one. This has an implicit premise: within the same scope. In nested scopes, the relationship is actually one of hiding. A variable definition in an inner function will shadow the definition of the same name in the outer function, but in the outer scope, the variable retains its original value:
In [16]: a = 4
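A minimal sketch of this hiding relationship, assuming two levels of nesting. The function names `outer` and `inner` and the values 5 and 6 are made up for illustration.

```python
a = 4                      # binding in the global scope

def outer():
    a = 5                  # a new binding local to outer; hides the global a

    def inner():
        a = 6              # local to inner; hides outer's a
        return a

    inner_value = inner()
    return inner_value, a  # outer's own a is still 5 after inner ran

inner_value, outer_value = outer()
print(inner_value, outer_value, a)   # 6 5 4
```

Each assignment creates a binding in its own scope, so none of the three values disturbs the others.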
As you can see, scope can also be understood from another angle: in a given environment, when determining the value of a name binding, we search from the innermost scope outward, and the value corresponding to the first binding of that name we find is the value the name refers to.
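The inside-out search can be sketched as follows. All names here (`site`, `region`, `spot`, `outer_lookup`, `inner_lookup`) are hypothetical; the comments note where each lookup is expected to succeed.

```python
site = "global"

def outer_lookup():
    region = "enclosing"

    def inner_lookup():
        spot = "local"
        # spot:   found in inner_lookup's own scope
        # region: not local, found one level out in outer_lookup
        # site:   found in the global scope
        # len:    found last, among the built-in names
        return spot, region, site, len(spot)

    return inner_lookup()

print(outer_lookup())   # ('local', 'enclosing', 'global', 5)
```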
It is worth emphasizing that nested function definitions cause nested scopes, or an environment extension relationship (the inner extends the outer). Class definitions are slightly different: a class definition introduces a new namespace. Namespace and scope are often compared, but I’ll leave that aside here; feel free to look it up if you’re interested.
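One way to see the difference, as a rough sketch: a name bound in a class body is not found by a bare-name lookup from inside a method. The class `Counter` and its members are hypothetical, and the example assumes no global variable named `step` exists.

```python
class Counter:
    step = 2                       # lives in the class namespace

    def bump(self, n):
        return n + self.step       # reached through the instance and its class

    def broken_bump(self, n):
        return n + step            # bare name: the class body is not searched

counter = Counter()
print(counter.bump(1))             # 3

lookup_error = None
try:
    counter.broken_bump(1)
except NameError as exc:
    lookup_error = type(exc).__name__

print(lookup_error)                # NameError
```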
Speaking of which, let me mention a commonly discussed example:
In [50]: a = 4
One might expect the print statement above to output 4 or 5. Why does it raise an error instead? When the definition of test is compiled, Python examines the whole function body. Seeing the assignment a = 5, it decides that a is a local variable throughout the function, so print(a) never looks at the global a and cannot output 4. When execution reaches print(a), a has not yet been bound in the local environment, hence the UnboundLocalError.
Digging a bit deeper: although Python is usually described as interpreted, it does not simply read and execute one statement at a time. Source code is first compiled to bytecode, and for a code block such as a function or class body the compiler scans the whole block up front. That scan is where each name in a function is classified as local or non-local.
First-Class Functions
Generally, elements that make up a programming language, such as variables, functions, and classes, are subject to different restrictions. The elements with the fewest restrictions are called first-class citizens of that programming language. The most common privileges of first-class citizens are:
- Can be bound to a name
- Can be passed as an argument to a function
- Can be returned as the result of a function
- Can be contained within other data structures
Applying this to functions in Python: a function can be assigned to a variable, can be received and returned by a function, and can be defined inside another function (i.e., nested definition):
In [32]: def test():
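The original snippet is truncated, so the following is an independent sketch of the four privileges applied to functions. Every name in it (`shout`, `speak`, `apply`, `make_repeater`, `handlers`) is hypothetical.

```python
def shout(text):
    return text.upper() + "!"

# 1. Bound to another name.
speak = shout

# 2. Passed as an argument.
def apply(func, value):
    return func(value)

# 3. Returned as a result (after being defined inside another function).
def make_repeater(times):
    def repeat(text):
        return text * times
    return repeat

# 4. Stored in a data structure.
handlers = {"shout": shout, "twice": make_repeater(2)}

print(speak("hi"))                 # HI!
print(apply(shout, "hey"))         # HEY!
print(handlers["twice"]("ab"))     # abab
```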
Not all languages treat functions as first-class citizens. For example, before Java 8 introduced lambdas and method references, Java methods had none of the four privileges listed above.
Here, functions that operate on other functions (i.e., functions that take other functions as arguments or return them as values) are called higher-order functions. Higher-order functions greatly enhance the expressive power of a language, but at the same time, they are not easy to use well.
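As a small illustration, here is a hypothetical higher-order function `compose` that both takes functions as arguments and returns one, next to the built-in `map`, which takes a function as its first argument. The helpers `increment` and `double` are made up for the example.

```python
def compose(f, g):
    """Return a new function that applies g first, then f."""
    def composed(x):
        return f(g(x))
    return composed

def increment(n):
    return n + 1

def double(n):
    return n * 2

double_then_increment = compose(increment, double)
print(double_then_increment(5))          # 11
print(list(map(double, [1, 2, 3])))      # [2, 4, 6]
```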
The Call Stack
Every function call creates a frame in the environment, performs some bindings within that frame, and pushes it onto the function call stack. When the function call ends, the frame is popped off, the bindings within it are released (objects that nothing else references can then be garbage collected), and the local scope ceases to exist.
In [47]: def test():
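A sketch of what the truncated example presumably demonstrates: a local variable `x` that is gone once the call returns. The function body and the names `result` and `x_visible` are assumptions, and the example assumes no global named `x` exists.

```python
def test():
    x = 1          # bound in the frame created for this call
    return x

result = test()    # the frame is popped once the call returns

try:
    x              # no binding named x exists out here
except NameError:
    x_visible = False
else:
    x_visible = True

print(result, x_visible)   # 1 False
```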
That is, after the call ends, the locally defined variable x cannot be accessed from outside. But as in the earlier example, the returned add function references the variable a from add_num, whose call has already ended. How do we explain this phenomenon? Remember the rule I mentioned before:
When functions are nested, the environment of the inner function automatically extends the environment where it was defined.
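Therefore, after the outer function returns, the returned inner function still maintains the extended environment from when it was defined. In other words, because the inner function holds a reference, the bindings it uses from the outer function's environment are not reclaimed. In CPython, only the variables the inner function actually references are kept alive this way, each stored in a cell object; the outer function's other locals are released as usual.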
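CPython exposes the retained bindings through the function's `__closure__` attribute and the free-variable names through `__code__.co_freevars`. In the sketch below, the body of `add` and the extra local `scratch` are assumptions added for illustration.

```python
def add_num(a):
    scratch = [0] * 1000     # never used by add, so it is not captured

    def add(b):
        return a + b         # a is a free variable of add

    return add

add5 = add_num(5)            # add_num has returned; its frame is gone

print(add5.__code__.co_freevars)           # ('a',)
print(len(add5.__closure__))               # 1
print(add5.__closure__[0].cell_contents)   # 5
print(add5(3))                             # 8
```

Only `a` appears among the captured cells, which matches the idea that the inner function keeps alive exactly what it references.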
Closure
At long last, it appears. But the climax has actually already passed.
A closure is built upon all the concepts discussed above. The example mentioned earlier:
In [37]: def add_num(a):
This is a closure. Returning to the question I left open earlier, here is my understanding: a closure arises when a higher-order outer function (add_num in the example) returns a function it defines internally (add). Because the returned inner function extends the outer function's environment, it holds a reference to that environment. When the returned function (add5) is later called, it can still reference the environment from when it was defined (in this example, the value of a).
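The example is cut off after its first line, so this is a reconstruction from the names mentioned in the text (`add_num`, `add`, `add5`, `a`). The body of `add` and the second closure `add10` are assumptions.

```python
def add_num(a):
    def add(b):
        return a + b    # a comes from the environment of this add_num call
    return add

add5 = add_num(5)       # a is bound to 5 in the first call's environment
add10 = add_num(10)     # a separate call, so a separate environment

print(add5(3))          # 8
print(add10(3))         # 13
print(add5 is add10)    # False
```

Each call to `add_num` produces a new inner function with its own value of `a`, so `add5` and `add10` do not interfere with each other.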
Conclusion
After all this, I have only explained what a closure is at a logical or abstract level, and which concepts it is entangled with. But none of this gets to the essence; it is still a castle in the air. If you want to truly understand closures, you can delve into Python’s interpretation and execution mechanism, but that falls into the realm of compiler theory.
References
- CS 61A course materials: Composing Programs, the course's companion text, written in the tradition of the book SICP. The book is a classic, the course is excellent, and the materials are fascinating, so they are well worth a read.
- A nice article I found via Google: A Python Tutorial To Understanding Scopes and Closures
