Introduction
After falling into the “pit” of Python’s default parameters several times, I decided to write a dedicated blog post about it. Recently, though, I came across an excellent English article (Default Parameter Values in Python, Fredrik Lundh | July 17, 2008 | based on a comp.lang.python post) that is incisive and to the point. With such a gem already available, there is little reason to write my own from scratch. Admittedly also out of some laziness, I offer a simple translation here in the hope that more people will see it.
What follows is a fairly free translation with some personal additions, so it does not match the original text exactly. The syntax follows Python 3.
Author: Muniao’s Notes https://www.qtmuniao.com, please indicate the source when reposting
Main Text
The way Python handles default parameter values is one of the few issues that can trip up most beginners (though usually only once).
This perplexing behavior usually appears when you use a “mutable” object as a default parameter value, that is, an object that can be changed in place, such as a list or dictionary.
An example:
def function(data=[]):
As shown in the code, the returned list gets longer and longer, instead of being [1] every time as one might imagine. Try checking the ID of the returned list each time, and you’ll find it hasn’t changed at all.
id(function())
The reason is simple: function() keeps using the same list object across calls, so our modification (data.append(1)) is “sticky” and persists from one call to the next.
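The behavior described above can be reproduced with a minimal sketch. The function body (append 1, then return the list) is reconstructed from the description in the text; the names first, second, third and first_snapshot are only for illustration.

```python
def function(data=[]):
    # The default list is created once, when this def statement runs.
    data.append(1)
    return data

first = function()
first_snapshot = list(first)  # copy, because `first` itself keeps growing
second = function()
third = function()

print(first_snapshot)             # [1]
print(third)                      # [1, 1, 1]
print(first is second is third)   # True: every call returned the same list
print(id(first) == id(third))     # True
```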
Why Does This Happen
The answer is: default parameter values are always evaluated when, and only when, the def statement that defines the function is executed. You can refer to the relevant chapter in The Python Language Reference:
https://docs.python.org/zh-cn/3.7/reference/compound_stmts.html#function-definitions
Default parameter values are evaluated from left to right when the function definition is executed. This means that the expression is evaluated once when the function is defined, and the same “precomputed” value is used for each call.
Note that def is an executable statement in Python, and default parameter values are evaluated when that def statement runs. If you execute the def statement multiple times, Python creates a new function object each time (and naturally re-evaluates the default values). We will see this in the following examples.
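One way to observe when the default value is evaluated is to make the default expression leave a trace. In the sketch below, make_default, collect, collect_again and the evaluations list are hypothetical names introduced only for this illustration.

```python
evaluations = []  # records each evaluation of the default expression

def make_default():
    # Hypothetical helper standing in for any default-value expression.
    evaluations.append("evaluated")
    return []

def collect(item, bucket=make_default()):
    bucket.append(item)
    return bucket

count_after_def = len(evaluations)    # 1: evaluated when the def ran
collect("a")
collect("b")
count_after_calls = len(evaluations)  # still 1: calls do not re-evaluate it

functions = []
for _ in range(3):
    # Each pass through the loop executes the def statement again.
    def collect_again(item, bucket=make_default()):
        bucket.append(item)
        return bucket
    functions.append(collect_again)

count_after_loop = len(evaluations)   # 4: one more evaluation per def
# Each of the three function objects carries its own default list.
distinct_defaults = len({id(f.__defaults__[0]) for f in functions})  # 3
```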
So What Should We Do
The workaround, as others have also mentioned, is to use a placeholder value as the default instead of modifying the default value itself. None is a commonly used placeholder:
def myfunc(value=None):
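A minimal sketch of the None placeholder pattern follows. It assumes the function wants an empty list when no argument is given; the append and return are there only to make the effect visible.

```python
def myfunc(value=None):
    if value is None:
        value = []  # a fresh list on every call that omits `value`
    value.append(1)
    return value

print(myfunc())     # [1]
print(myfunc())     # [1]
print(myfunc([0]))  # [0, 1]
```

Because the list is created inside the body, nothing is shared between calls.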
If you need to handle arbitrary types of data (including None), you can use a sentinel instance:
sentinel = object()
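A sketch of how such a sentinel might be used. The function name describe and its return strings are hypothetical; the point is that the identity test with is distinguishes “no argument” from an explicit None.

```python
sentinel = object()  # unique object; no caller-supplied value is identical to it

def describe(value=sentinel):
    if value is sentinel:
        return "no argument given"
    return f"got {value!r}"

print(describe())      # no argument given
print(describe(None))  # got None
print(describe(0))     # got 0
```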
Of course, in some old code written before object was introduced into Python, the following statement was also commonly used to create a non-false object with a unique identity:
sentinel = ['placeholder']
This works because a list display such as ['placeholder'] creates a new list object every time it is evaluated.
Proper Ways to Leverage This
It’s worth mentioning that some advanced Python code often deliberately takes advantage of this feature. For example, if you want to create a bunch of buttons through a loop, you might do this:
for i in range(10):
Unfortunately, you will then discover that all the callback functions print the same value (in the example above, most likely 9). The reason is that Python’s nested scopes bind to variables, not to object values, so every callback sees the final value of the variable i. To fix this, use explicit binding: capture the current value as a default parameter when the inner function is defined.
for i in range(10):
The i=i part relies on the fact that default values are evaluated each time the def statement is executed: it binds the parameter i (a local variable) to the current value of the outer variable i.
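The difference between the two loops can be sketched without any GUI toolkit. This example uses plain lists of callbacks instead of buttons and range(3) instead of range(10); both choices, and the names late_bound and early_bound, are assumptions made to keep it short.

```python
late_bound = []
for i in range(3):
    def callback():
        return i  # i is looked up when the callback runs, not when it is defined
    late_bound.append(callback)

late_results = [cb() for cb in late_bound]    # [2, 2, 2]

early_bound = []
for i in range(3):
    def callback(i=i):  # the default captures the current value of i
        return i
    early_bound.append(callback)

early_results = [cb() for cb in early_bound]  # [0, 1, 2]

print(late_results)
print(early_results)
```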
There are two other possible uses. One is result caching/memoization:
def calculate(a, b, c, memo={}):
This usage is very useful in certain recursive functions (such as memoized search).
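A sketch of the memoization idea, assuming a hypothetical heavy_calculation function as the expensive step. The calls list exists only to show how often the real computation runs.

```python
calls = []  # records real computations, for demonstration only

def heavy_calculation(a, b, c):
    # Hypothetical stand-in for an expensive computation.
    calls.append((a, b, c))
    return a * b + c

def calculate(a, b, c, memo={}):
    try:
        value = memo[a, b, c]  # reuse an already calculated value
    except KeyError:
        value = heavy_calculation(a, b, c)
        memo[a, b, c] = value  # update the shared memo dictionary
    return value

first = calculate(2, 3, 4)   # computed: 10
second = calculate(2, 3, 4)  # served from memo: 10
print(first, second, len(calls))  # 10 10 1
```

The memo dictionary lives as long as the function object does, so the cache is never cleared unless you do so explicitly.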
Second, in highly optimized code, you can bind global names to local variables to improve performance:
import math
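A sketch of this binding trick. The function name this_one_must_be_fast and its body are illustrative assumptions, and any speed gain depends on the interpreter version, so measure before relying on it.

```python
import math

def this_one_must_be_fast(x, sin=math.sin, cos=math.cos):
    # sin and cos are local names here, so each use avoids the
    # global lookup of `math` and the attribute lookup on it.
    return sin(x) * sin(x) + cos(x) * cos(x)

print(abs(this_one_must_be_fast(0.3) - 1.0) < 1e-12)  # True
```

The cost is a less clean signature: a caller could accidentally override sin or cos by passing extra arguments.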
A Detailed Explanation of the Principle
When Python executes a def statement (i.e., a function definition), it takes some ready-made pieces (the compiled function body, available as __code__, and the current global namespace, available as __globals__) and builds a new function object from them. While doing so, it also evaluates the default parameter values and stores them in the function object as an attribute (__defaults__).
All of these pieces can be accessed through the function object’s attributes:
function.__name__
Since you can access the default values, you can of course modify them:
function.__defaults__[0][:] = []
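To see the effect of this statement, here is a minimal sketch, again with the function body reconstructed from the text. The slice assignment empties the stored list in place rather than replacing it.

```python
def function(data=[]):
    data.append(1)
    return data

function()
function()
before = list(function.__defaults__[0])       # [1, 1]

function.__defaults__[0][:] = []              # empty the stored list in place
after_reset = list(function.__defaults__[0])  # []

result = function()                           # [1]
print(before, after_reset, result)
```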
However, you’d better not do this (modifying things you don’t understand, such as private variables or system variables, will lead to some magical consequences).
Another way to reset the default values is to re-execute the same def statement. When you do this, Python builds a new function object from the compiled function body, re-evaluates the default values, and binds the new function object to the name function again. To emphasize once more, only do this when you know exactly what the consequences will be.
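Re-executing a def statement can be sketched by wrapping it in a helper, here a hypothetical define() function, so the same statement runs twice. The comparison of __code__ objects reflects CPython behavior and is an implementation detail, not a language guarantee.

```python
def define():
    # Each call to this helper executes the def statement inside it once more.
    def function(data=[]):
        data.append(1)
        return data
    return function

function = define()
function()
function()
polluted = list(function.__defaults__[0])  # [1, 1]

old = function
function = define()  # rebind the name to a newly built function object
fresh = function()   # [1]

same_code = old.__code__ is function.__code__  # True in CPython: body compiled once
same_defaults = old.__defaults__[0] is function.__defaults__[0]  # False
print(polluted, fresh, same_code, same_defaults)
```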
Of course, you can also build function objects yourself with the function class in the new module (the new module was removed in Python 3, where types.FunctionType serves the same purpose).
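In Python 3 the same thing can be done with types.FunctionType. The sketch below builds a second function from an existing code object and gives it its own default list; the names template and clone are hypothetical.

```python
import types

def template(data=[]):
    data.append(1)
    return data

template()  # mutate the template's own default list

# Build a second function from the same code object, with its own defaults.
clone = types.FunctionType(
    template.__code__,     # compiled function body
    template.__globals__,  # global namespace the body will use
    "clone",               # name of the new function
    ([],),                 # tuple of default values
)

print(clone())     # [1]
print(template())  # [1, 1]
```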
Summary
The root of all this is that Python is a dynamic language. Defining a function binds a name to a function object, just like assigning to an ordinary variable. The default value expressions in the function header are evaluated only at that moment, and the results are stored as an attribute of the function object (__defaults__). Afterwards, calling the function through that name executes only the statements in the function body (the code object referenced by __code__).
In static languages where functions are not first-class citizens, function definitions are processed at compile time and cannot be rebound at run time. On each call, the actual arguments are bound to the formal parameters, and default values are assigned afresh.
