Introduction
As every Python programmer knows, Python doesn’t support code block objects: you can’t create and pass around anonymous blocks of code. That’s a pity, and many coders have suggested workarounds. Some tried to introduce proper code blocks by manipulating bytecode, while others proposed using lambdas with concatenated expressions to emulate multi-statement blocks.
Unfortunately, these attempts, while interesting and clever, can’t be used in production code because of their many limitations and oddities. I tried many approaches myself, but none was perfect: one didn’t support nonlocal, another was slow, and so on.
Finally, I went for source code rewriting and I’m quite satisfied with the result.
Here are the main features:
- code blocks act as normal functions/closures (you can use return, yield, global and nonlocal inside of them)
- the source code is rewritten, on-the-fly, in RAM: no files are created or modified
- syntax errors related to the new syntax generate meaningful error messages
- error messages refer to your code and not to the codeblocks module
- debuggers break on your code, as they should
- stepping and tracing through your code behave as expected
- codeblocks doesn’t mess with the importation process
- codeblocks doesn’t manipulate byte code and should be quite portable
- codeblocks works with both Python 2.7 and Python 3.2 (but nonlocal is not supported in Python 2.7)
Repository
You can download (and contribute to, if you wish) this module from Bitbucket.
The docstring of the module offers more technical documentation and lists the exact rewriting rules applied to the original code.
A little example
def each(iterable, block):
    for e in iterable:
        block(e)            # step into -> 6:

with each(range(0, 10)) << 'x':
    print('element ' + str(x))
That with statement is a special statement that
- creates an anonymous function anon_func which takes an argument x and has line 6 as its body,
- passes anon_func as a positional argument to the function each,
- executes each with all its arguments (the one given at line 5 and the block).
If you issue the step into command when at line 3, you’ll immediately jump to line 6. In the same way, step into will take you from line 5 to line 2. Any other command (step over, step out, …) will also work as expected.
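As a rough sketch, the special with statement above behaves much like the plain Python below. This is not the code that codeblocks actually generates (the exact rules are in the module docstring); the name anon_func comes from the description above, and the printed list is a hypothetical helper added only so the result can be inspected.

```python
def each(iterable, block):
    for e in iterable:
        block(e)

printed = []  # hypothetical helper, not part of the original example

# The body of the with statement becomes the body of a function
# whose parameter list is taken from the string literal 'x'.
def anon_func(x):
    line = 'element ' + str(x)
    printed.append(line)
    print(line)

# The block is passed to each as its last positional argument.
each(range(0, 10), anon_func)
```

The point of the with syntax is that you never have to name anon_func or define it before the call.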
Usage
Here’s the previous example with all the ugly stuff included:
import codeblocks
codeblocks.rewrite()

def each(iterable, block):
    for e in iterable:
        block(e)            # step into -> 6:

with each(range(0, 10)) << 'x':
    print('element ' + str(x))
That’s all. Just make sure that you import codeblocks and call rewrite right at the start of your code. What really happens is that rewrite rewrites the code, executes it and quits. This means that you should put a breakpoint on a line of code which follows your invocation of rewrite. If you stepped into rewrite, you’d come across a call to exec. Stepping into it would also work, but setting a breakpoint is the recommended way.
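To give a feel for the rewrite-then-exec idea, here is a toy sketch. It is not the codeblocks implementation: the function rewrite_and_run and its single rule (turning _return(expr) into return expr) are made up for illustration. It only shows why rewriting line by line in memory and compiling under the original file name keeps line numbers meaningful for error messages and debuggers.

```python
def rewrite_and_run(source, filename='<example>'):
    """Toy stand-in for a source rewriter (hypothetical, for illustration only)."""
    new_lines = []
    for line in source.splitlines():
        stripped = line.strip()
        # Hypothetical rule: _return(expr) -> return expr, on the same line,
        # so that every statement keeps its original line number.
        if stripped.startswith('_return(') and stripped.endswith(')'):
            indent = line[:len(line) - len(line.lstrip())]
            line = indent + 'return ' + stripped[len('_return('):-1]
        new_lines.append(line)
    new_source = '\n'.join(new_lines)

    # Everything happens in memory: no file is created or modified.
    code = compile(new_source, filename, 'exec')
    namespace = {}
    exec(code, namespace)
    return namespace

ns = rewrite_and_run("def double(x):\n    _return(x * 2)\n")
print(ns['double'](21))
```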
If you want to use codeblocks in a module of yours, you’ll need to call end_of_module at the end of your module. If you forget it, an exception will be raised during the rewriting. The code above, if in a module, would thus look as follows:
import codeblocks
codeblocks.rewrite()

def each(iterable, block):
    for e in iterable:
        block(e)            # step into -> 6:

with each(range(0, 10)) << 'x':
    print('element ' + str(x))

codeblocks.end_of_module()
Nothing too invasive, I hope.
Creating a block
Unfortunately, Python syntax doesn’t let us create blocks as part of normal expressions: we need to use a statement which already takes a block of code. I chose with over for, if, etc…
Here’s the fastest way to create a block and assign it to a variable:
def id(x):
    return x

with my_block << id() << 'x, y = 3':
    print(x, y)

my_block(4)
my_block(1, 2)
print(my_block)
which prints
4 3
1 2
<function codeblock_1b4c94d3_1 at 0x02E7AC90>
I could’ve added special syntax for this very case, but I don’t think it’s worth it.
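In plain Python, the example above corresponds roughly to the sketch below. The function name codeblock_1 is hypothetical (the real generated name looks like the one in the output above), and the mapping is my reading of the syntax, not the literal rewritten code.

```python
def id(x):
    return x

# The string literal 'x, y = 3' becomes the parameter list of the block.
def codeblock_1(x, y=3):
    print(x, y)

# id receives the block and returns it unchanged;
# the result of the call is what gets bound to my_block.
my_block = id(codeblock_1)

my_block(4)
my_block(1, 2)
```

Because id simply returns its argument, my_block ends up being the block itself.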
Bug in Python
I’ve just discovered a bug in Python. A patch is already available, but your version could still suffer from this bug.
Here’s the bug. Create a file script.py with this content:
with open('test') as f:
    data = f.read()
with open('test') as f:
    data = f.read()
Now start debugging it with a debugger of your choice and, from the first line, jump to line 3. If it crashes, the bug is still there.
The easiest way to see this is by using pdb. Run it with python -m pdb script.py and, once in the debugger, issue the command j 3.
This bug is triggered only if you use with statements at the module level in imported modules which use codeblocks. Because this is not too big a limitation, I’ve decided to insert a bug check in codeblocks which raises an exception (with a meaningful error message) if the user writes code that would trigger this bug. When the bug is finally gone from official Python releases, I’ll remove the bug check for patched versions.
Let’s move on.
Pseudo-keywords
When Python doesn’t let you use a keyword inside a with statement, you can use a pseudo-keyword.
Here’s the short list of keywords and corresponding pseudo-keywords:
keyword | pseudo-keyword |
global identifiers | _global(identifiers) |
nonlocal identifiers | _nonlocal(identifiers) |
return expression | _return(expression) |
yield expression | _yield(expression) |
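The sketch below shows the plain-Python statement each pseudo-keyword stands for, assuming the one-to-one mapping given in the table. The function names are hypothetical and only serve the illustration.

```python
def add(x, y):
    return x + y            # written as _return(x + y) inside a block

def countdown(n):
    while n > 0:
        yield n             # written as _yield(n) inside a block
        n -= 1

counter = 0

def bump():
    global counter          # written as _global(counter) inside a block
    counter += 1

def make_counter():
    count = 0
    def increment():
        nonlocal count      # written as _nonlocal(count) inside a block
        count += 1
        return count
    return increment

bump()
increment = make_counter()
increment()
second = increment()
```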
Passing a block as a positional argument
To pass a block as a positional argument to a function you use this syntax:
with each(range(0, 10)) << 'x':
    print('element ' + str(x))
where each is called as each(range(0, 10), block).
Passing a block as a keyword argument
To pass the same block as a keyword argument of name block_name, you use a similar syntax:
with each(range(0, 10)) << block_name << 'x':
    print('element ' + str(x))
or also
with each(range(0, 10)) << 'block_name' << 'x':
    print('element ' + str(x))
This calls each as each(range(0, 10), block_name = block).
Notice that block_name can be a literal or an identifier, but the arguments for the block (just x in this example) must be a literal. There are two reasons for this:
- It’s harder for a normal with to be misidentified as a special with and then erroneously rewritten.
- Default-valued arguments (such as y = 3) aren’t valid in that position under Python’s syntax, so a literal is needed in that case.
All the other parameters (such as block_name, in the example) can be both literals and identifiers.
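Here is a plain-Python sketch of the keyword form. It assumes a variant of each whose parameter is actually called block_name (otherwise the keyword call would fail); the seen list is a hypothetical helper used to inspect the result.

```python
# Assumption: this variant of each names its block parameter block_name.
def each(iterable, block_name):
    for e in iterable:
        block_name(e)

seen = []  # hypothetical helper, not part of the original example

def anon_func(x):
    seen.append('element ' + str(x))

# with each(range(0, 10)) << block_name << 'x': ...
# calls each with the block bound to the keyword block_name.
each(range(0, 10), block_name=anon_func)
print(seen[0], '...', seen[-1])
```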
Passing a block which takes no arguments
If the block takes no arguments, you should use the empty literal:
with each(range(0, 10)) << block_name << '':
    print('element ' + str(1))
If you don’t like it, you can use _ (underscore) or None:
with each(range(0, 10)) << block_name << _:
    print('element ' + str(1))

with each(range(0, 10)) << block_name << None:
    print('element ' + str(1))
Passing more than one block to a function
If a function takes more than one block, you need to use a multi with:
def take3(block1, block2, block3):
    for b in (block1, block2, block3):
        print('=== {} ===\n{}'.format(b, b()))

with take3() << ':multi':
    with '':                        # 1st pos arg
        _return("I'm block 1!")
    with block3 << '':
        _return("I'm block 3!")
    with '':                        # 2nd pos arg
        _return("I'm block 2!")
This prints
=== <function codeblock_1b4c94d3_6 at 0x02E5A930> ===
I'm block 1!
=== <function codeblock_1b4c94d3_8 at 0x02E5AE40> ===
I'm block 2!
=== <function codeblock_1b4c94d3_7 at 0x02E5AA98> ===
I'm block 3!
The last three with statements are internal withs and have a simpler syntax: they can only take an optional block_name and a required code_args, i.e. the arguments on the right of functions in non-internal withs.
In the example above, take3 is called as take3(anon_block1, anon_block2, block3 = anon_block3).
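Spelled out in plain Python, that call looks roughly like the sketch below. The names anon_block1, anon_block2 and anon_block3 are the ones used in the sentence above; the real generated names differ.

```python
def take3(block1, block2, block3):
    for b in (block1, block2, block3):
        print('=== {} ===\n{}'.format(b, b()))

def anon_block1():
    return "I'm block 1!"

def anon_block3():
    return "I'm block 3!"

def anon_block2():
    return "I'm block 2!"

# Unnamed internal withs become positional arguments, in order of appearance;
# the internal with named block3 becomes a keyword argument.
take3(anon_block1, anon_block2, block3=anon_block3)
```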
Note that the keywords can be arbitrary names in some circumstances:
def take3(**kwargs):
    for name in kwargs:
        print('=== {} ===\n{}'.format(name, kwargs[name]()))

with take3() << ':multi':
    with 'num #1' << '':            # arbitrary strings!
        _return("I'm block 1!")
    with 'num #3' << '':
        _return("I'm block 3!")
    with 'num #2' << '':
        _return("I'm block 2!")
This prints
=== num #3 ===
I'm block 3!
=== num #2 ===
I'm block 2!
=== num #1 ===
I'm block 1!
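Keyword names like 'num #1' aren’t valid identifiers, so in plain Python they can only be passed through dictionary unpacking. The sketch below assumes that this is in effect what happens; the blocks dictionary is a hypothetical name. The order in which the entries are printed depends on the Python version, so no output is shown.

```python
def take3(**kwargs):
    for name in kwargs:
        print('=== {} ===\n{}'.format(name, kwargs[name]()))

# Hypothetical stand-ins for the three anonymous blocks.
blocks = {
    'num #1': lambda: "I'm block 1!",
    'num #3': lambda: "I'm block 3!",
    'num #2': lambda: "I'm block 2!",
}

# Arbitrary string keys are accepted because take3 collects them in **kwargs.
take3(**blocks)
```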
Passing one or more blocks to a function as a list
You can also pass one or more blocks to a function as a single list:
def take3(title, blocks):
    print(title)
    for b in blocks:
        print(' ' + b())

with take3('List of blocks:') << ':list':
    with '':
        _return("I'm block 1!")
    with '':
        _return("I'm block 3!")
    with '':
        _return("I'm block 2!")
This prints
List of blocks:
 I'm block 1!
 I'm block 3!
 I'm block 2!
Internal withs can take only one argument, in this case. This is what happens if you give one of them two arguments:
  File "C:\...\test.py", line 124
    with wrong_arg << '':           # arbitrary strings!
    ^
SyntaxError: :list forbids internal Withs with codekw args.
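For comparison, the valid :list example above corresponds roughly to this plain Python, where the blocks are collected in order of appearance and passed as one list. The names block_a, block_b and block_c are hypothetical.

```python
def take3(title, blocks):
    print(title)
    for b in blocks:
        print(' ' + b())

def block_a():
    return "I'm block 1!"

def block_b():
    return "I'm block 3!"

def block_c():
    return "I'm block 2!"

# The list is passed as the last positional argument.
take3('List of blocks:', [block_a, block_b, block_c])
```

Since a list has no keys, it makes sense that internal withs can’t carry a name here.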
Passing one or more blocks to a function as a dictionary
You can pass one or more blocks to a function as a single dictionary as well:
def take3(dict, silent = True):
    for key in sorted(dict.keys()):
        print('=== {} ===\n{}'.format(key, dict[key]()))

with take3(silent = False) << ':dict':
    with 'block 1' << '':
        _return("I'm block 1!")
    with 'block 3' << '':
        _return("I'm block 3!")
    with 'block 2' << '':
        _return("I'm block 2!")
This prints:
=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm block 3!
Here’s another example:
import logging
import random

logging.basicConfig(level = logging.DEBUG)

with operations << dict() << ':dict':
    with add << 'x, y':
        logging.debug('doing {} + {}'.format(x, y))
        _return(x + y)
    with sub << 'x, y':
        logging.debug('doing {} - {}'.format(x, y))
        _return(x - y)
    with mul << 'x, y':
        logging.debug('doing {} * {}'.format(x, y))
        _return(x * y)
    with div << 'x, y':
        logging.debug('doing {} / {}'.format(x, y))
        _return(x / y)

opnd1 = random.randint(1, 100)
opnd2 = random.randint(1, 100)
op = random.choice(('add', 'sub', 'mul', 'div'))
print(operations[op](opnd1, opnd2))
Here are a few runs:
DEBUG:root:doing 10 / 87
0.11494252873563218

DEBUG:root:doing 34 + 50
84

DEBUG:root:doing 67 / 15
4.466666666666667

DEBUG:root:doing 5 - 51
-46

DEBUG:root:doing 10 * 52
520
You get the idea.
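Without code blocks, the same dispatch table would be written along these lines. This is only a sketch of the equivalent plain Python (Python 3 division semantics assumed); fixed operands replace the random ones so that the results are predictable.

```python
import logging

logging.basicConfig(level=logging.DEBUG)

def add(x, y):
    logging.debug('doing {} + {}'.format(x, y))
    return x + y

def sub(x, y):
    logging.debug('doing {} - {}'.format(x, y))
    return x - y

def mul(x, y):
    logging.debug('doing {} * {}'.format(x, y))
    return x * y

def div(x, y):
    logging.debug('doing {} / {}'.format(x, y))
    return x / y

# The internal block names become the dictionary keys,
# and the dictionary is handed to dict().
operations = dict({'add': add, 'sub': sub, 'mul': mul, 'div': div})

print(operations['add'](34, 50))
print(operations['sub'](5, 51))
```

The with version keeps each function body next to its key, which is the main attraction here.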
The result of the function in a with can be assigned to variables
You can assign the result of the function called in a with statement, but you can’t just use the assignment operator:
def take3(list, silent = True):
    for i in range(len(list)):
        print('=== {} ===\n{}'.format('block ' + str(i + 1), list[i]()))
    return list[0](), [list[1](), list[2]()]        # just a test

with (r1, [r2, r3]) << take3(silent = False) << list << ':list':
    with '':
        _return("I'm block 1!")
    with '':
        _return("I'm block 2!")
    with '':
        _return("I'm block 3!")

print('---')
print(r1, r2, r3)
This prints
=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm block 3!
---
I'm block 1! I'm block 2! I'm block 3!
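In plain Python, the leftmost operand of the with plays the role of an assignment target. A rough sketch, with the hypothetical name blocks for the list of anonymous blocks:

```python
def take3(list, silent=True):
    for i in range(len(list)):
        print('=== {} ===\n{}'.format('block ' + str(i + 1), list[i]()))
    return list[0](), [list[1](), list[2]()]        # just a test

# Hypothetical stand-ins for the three anonymous blocks.
blocks = [
    lambda: "I'm block 1!",
    lambda: "I'm block 2!",
    lambda: "I'm block 3!",
]

# The list of blocks is passed under the keyword name list,
# and the returned value is unpacked into the nested target.
r1, [r2, r3] = take3(silent=False, list=blocks)

print('---')
print(r1, r2, r3)
```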
With statements can be nested
Of course they can be nested! Here we go:
import re
import random

def take3(dict, silent = True):
    for key in sorted(dict.keys()):
        print('=== {} ===\n{}'.format(key, dict[key]()))

with take3(silent = False) << ':dict':
    with 'block 1' << '':
        _return("I'm block 1!")
    with 'block 3' << '':
        text = "I'm surely codeblock number three!"
        with ris << re.sub(r'(\w)(\w+)(\w)', string = text) << repl << 'm':    # From Python's doc.
            inner_word = list(m.group(2))
            random.shuffle(inner_word)
            _return (m.group(1) + "".join(inner_word) + m.group(3))
        _return (ris)
    with 'block 2' << '':
        _return("I'm block 2!")
Here are a few runs:
=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm seurly cbooledck nmuebr three!

=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm srluey clbceoodk nubmer terhe!

=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm srluey cbodoelck nmeubr there!
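The inner with is the interesting part: the block is passed to re.sub as its repl callback, and the result is bound to ris. Here is that piece alone in plain Python, as a sketch; shuffle_inner is a hypothetical name for the anonymous block.

```python
import random
import re

text = "I'm surely codeblock number three!"

# Called once per match; keeps the first and last letters of each word
# and shuffles the letters in between.
def shuffle_inner(m):
    inner_word = list(m.group(2))
    random.shuffle(inner_word)
    return m.group(1) + "".join(inner_word) + m.group(3)

# The block is passed under the keyword name repl.
ris = re.sub(r'(\w)(\w+)(\w)', string=text, repl=shuffle_inner)
print(ris)
```

The output changes from run to run, but the string always keeps its length and its letters.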
Code blocks allow global and nonlocal declarations
Because code blocks are implemented as real functions, you can use global and nonlocal in the usual way:
my_var = None

def each(iterable, block):
    for e in iterable:
        block(e)

def gen_funcs():
    sum = 0
    def acc_elems(*elems):
        # You would never use it like this: this is just a test, as always!
        with each(elems) << 'x':
            global my_var
            nonlocal sum
            sum += x
            my_var = 'modified from inside the block'
    def get_sum():
        return sum
    return get_sum, acc_elems

get_sum, acc_elems = gen_funcs()
print('my_var: ', my_var)
print('sum: ', get_sum())
acc_elems(1, 2, 3)
print('after acc_elems(1, 2, 3):')
print(' my_var: ', my_var)
print(' sum: ', get_sum())
acc_elems(4, 5, 6)
print('after acc_elems(4, 5, 6):')
print(' my_var: ', my_var)
print(' sum: ', get_sum())
This prints
my_var: None
sum: 0
after acc_elems(1, 2, 3):
 my_var: None
 sum: 6
after acc_elems(4, 5, 6):
 my_var: None
 sum: 21
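The nonlocal part of this example can be sketched in plain Python as a function nested one level deeper. This sketch deliberately leaves out the global declaration and focuses on the accumulator; anon_block, first_total and second_total are hypothetical names.

```python
def each(iterable, block):
    for e in iterable:
        block(e)

def gen_funcs():
    sum = 0
    def acc_elems(*elems):
        # The block becomes a nested function, so nonlocal reaches
        # the sum variable that belongs to gen_funcs.
        def anon_block(x):
            nonlocal sum
            sum += x
        each(elems, anon_block)
    def get_sum():
        return sum
    return get_sum, acc_elems

get_sum, acc_elems = gen_funcs()
acc_elems(1, 2, 3)
first_total = get_sum()
acc_elems(4, 5, 6)
second_total = get_sum()
print(first_total, second_total)
```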
Code blocks allow yield
No surprises here either:
def print_all(gen):
    for msg in gen(verb = True):
        print(msg)

with print_all() << 'n = 10, e = 3, verb = False':
    if verb:
        for i in range(n):
            _yield('{}^{} is {}'.format(i, e, i ** e))
    else:
        for i in range(n):
            _yield(i, i ** e)
This prints
0^3 is 0
1^3 is 1
2^3 is 8
3^3 is 27
4^3 is 64
5^3 is 125
6^3 is 216
7^3 is 343
8^3 is 512
9^3 is 729
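A block that uses _yield behaves like an ordinary generator function. Here is a plain-Python sketch of the example above; anon_gen is a hypothetical name, and treating _yield(i, i ** e) as yielding a tuple is my assumption.

```python
def print_all(gen):
    for msg in gen(verb=True):
        print(msg)

# The string literal 'n = 10, e = 3, verb = False' becomes the parameter list.
def anon_gen(n=10, e=3, verb=False):
    if verb:
        for i in range(n):
            yield '{}^{} is {}'.format(i, e, i ** e)
    else:
        for i in range(n):
            yield i, i ** e     # assumption: two arguments are yielded as a tuple

print_all(anon_gen)
```

Note that print_all receives the generator function itself and decides how to call it, which is why the default values matter.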
That’s all!
As always, comments and constructive criticism are greatly appreciated.