During EuroPython 2012, after my training and talks, I really needed to do some coding, so I started hacking on a ’practical’ application of the AST – making (some) constants faster than local variables, instead of slower by inlining them on import.
To do this, I use a custom importer on meta_path to intercept the import, and find any assignments look like constants – that is, any module-level variable in ALL_CAPS that is a simple string or a number.
I store those names and values, and then simply replace any attempt to load the name with the stored value instead. (Yes – this can go wrong in many horrible ways!)
For those who don’t know the AST – Abstract Syntax Tree – is a tree representation of the syntactic structure of your program. Python lets you look at and modify the AST before compiling it, which in effect allows you to rewrite the very structure of a program long before it runs. For a good introduction to the AST I heartily recommend What would you do with an AST by Matthew Desmarais.
To do this, I first need to intercept the import – the importer itself is not very interesting, but what it does is that it tries to find the source file for any imported module (if, for example, a .pyc file is found, it simply strips the ’c’ and tries to load the file.
With the source file found and read, the importer just calls the transformer, compiles the result, and sticks it into sys.modules:
module = types.ModuleType(name) inlined = transform(src) code = compile(inlined, filename, 'exec') sys.modules[name] = module exec(code, module.__dict__)
The transform method parses the source, creates the NodeTransformer that will modify the AST, and passes the parsed AST to it.
def transform(src): """Transforms the given source and return the AST""" tree = ast.parse(src) cm = ConstantMaker() newtree = cm.visit(tree) return newtree
Our NodeTransformer is equally simple, and overloads visit_Module (to find the constants), and visit_Name (to replace uses of the names with the value). visit_Module starts with building a list of all assignments in the module body, and then filters out assignments that fulfill our criteria for constants: they should be numbers or strings, and they should be named in ALL_CAPS. Any such assignments are stored in a name->value map that can then be used by visit_Name.
def visit_Module(self, node): """Find eglible variables to be inlined and store the Name->value mapping in self._constants for later use""" assigns = [x for x in node.body if type(x) == _ast.Assign] for assign in assigns: if type(assign.value) in (_ast.Num, _ast.Str): for name in assign.targets: if RE_CONSTANT.match(name.id): self._constants[name.id] = assign.value return self.generic_visit(node)
The parsing of assignments must be done before the call to generic_visit, or we’ll not have the mapping until after the rest of the module has already been visited. The mapping makes the work of visit_Name extremely simple:
def visit_Name(self, node): """If node.id is in self._constants, replace the loading of the node with the actual value""" return self._constants.get(node.id, node)
And that’s all we need to do! A simple (simplistic?) benchmark shows that it works as expected for simple cases – given the following source that mixes constant access with some other ’work’:
ONE = 1 TWO = "two" THREE = 3 FOUR = 4.0 def costly(iterations): tempstr = "" obj = MyClass() for i in range(iterations): tempstr += ONE * TWO * THREE + str(i) obj.change() tempstr += str(obj.value) eturn tempstr class MyClass(object): def __init__(self): self.value = random.random()*THREE def change(self): self.value += random.random()*FOUR
…a transformed version runs 15-20% faster than the untransformed version. Of course, my first benchmark which did only loading of constants was a bazillion times faster, but also not very interesting.
This is of course a very limited implementation – a ’proper’ implementation would have to prevent writing to constants (right now writes will be silently ignored by code in the current module), in-module writes to a constant should be detected, the transform should fallback to return the untransformed tree if it fails, and maybe, just maybe, it’s just not a very good idea at all.
It was, however, great fun to write! The code is available at the blog repository – the timer I use for the benchmarks is written by @BaltoRouberol
Next experiment will be inlining of functions, I think, or maybe lazy evaluation of function parameters.
0 kommentarer