Computer Science

Here you'll find some information about programming languages and software development in general.

Code blocks in Python

Posted by mtomassoli on April 20, 2012

Introduction

As everyone knows, Python doesn’t support code block objects: you can’t create and pass around anonymous blocks of code. That’s a pity, and in fact many coders have suggested workarounds. Some tried to introduce proper code blocks by manipulating bytecode, while others proposed using lambdas with concatenated expressions to emulate multi-statement blocks.

Unfortunately, these attempts, while very interesting and clever, can’t be used in production code because of their many limitations and oddities. I tried many approaches myself, but none was perfect: one didn’t support nonlocal, another was slow, and so on.

Finally, I went for source code rewriting and I’m quite satisfied with the result.

Here are the main features:

  • code blocks act as normal functions/closures (you can use return, yield, global and nonlocal inside of them)
  • the source code is rewritten, on-the-fly, in RAM: no files are created or modified
  • syntax errors related to the new syntax generate meaningful error messages
  • error messages refer to your code and not the module codeblocks
  • debuggers break on your code, as they should
  • stepping and tracing through your code behave as expected
  • codeblocks doesn’t mess with the importation process
  • codeblocks doesn’t manipulate byte code and should be quite portable
  • codeblocks works with both Python 2.7 and Python 3.2 (but nonlocal is not supported in Python 2.7)
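
Two of the features above (rewriting in RAM and error messages that point at your own file) can be obtained with nothing more than the built-ins compile and exec. The following is a minimal sketch of that general technique, not the actual codeblocks implementation: the file name my_script.py, the sample source and the pass-through "rewriting" step are all placeholders invented for illustration.

```python
import traceback

# Source text as it might be read from the user's file (hypothetical content).
original = "def f():\n    return 1 / 0\n\nf()\n"

# Placeholder for a real rewriting step; here the text is passed through unchanged.
rewritten = original

# Compiling under the user's file name makes tracebacks refer to that file,
# even though the code object was built from a string held in memory.
code = compile(rewritten, 'my_script.py', 'exec')

try:
    exec(code, {'__name__': '__main__'})
except ZeroDivisionError:
    report = traceback.format_exc()

print(report)
```

The traceback names my_script.py and the line numbers of the compiled text, which is why a rewriter that wants accurate messages has to keep line numbers aligned with the original source.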

Repository

You can download (and contribute to, if you wish) this module from bitbucket.

The docstring of the module offers more technical documentation and the exact rewriting rules applied to the original code.

A little example

def each(iterable, block):
    for e in iterable:
        block(e)                        # step into -> 6:

with each(range(0, 10)) << 'x':
    print('element ' + str(x))

That with statement is a special statement that

  1. creates an anonymous function anon_func which takes an argument x and has line 6 as its body,
  2. passes anon_func as a positional argument to the function each,
  3. executes each with all its arguments (the one given at line 5 and the block)

If you issue the step into command when at line 3, you’ll immediately jump to line 6. The same way, step into will take you from line 5 to line 2. Any other command (step over, step out, …) will also work as expected.
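
In plain Python, the three steps above correspond roughly to the sketch below. The name anon_func comes from the description above; the lines list is my own addition so that the result can be inspected, and the code that codeblocks really generates may differ in its details.

```python
def each(iterable, block):
    for e in iterable:
        block(e)

lines = []          # collected here only to make the result easy to check

# Step 1: the body of the with statement becomes an ordinary function.
def anon_func(x):
    lines.append('element ' + str(x))

# Steps 2 and 3: the function is appended to the positional arguments of each.
each(range(0, 10), anon_func)

print('\n'.join(lines))
```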

Usage

Here’s the previous example with all the ugly stuff included:

import codeblocks

codeblocks.rewrite()

def each(iterable, block):
    for e in iterable:
        block(e)

with each(range(0, 10)) << 'x':
    print('element ' + str(x))

That’s all. Just make sure that you import codeblocks and call rewrite right at the start of your code. What really happens is that rewrite rewrites the code, executes it and quits. This means that you should put a breakpoint on a line of code that follows your invocation of rewrite. If you stepped into rewrite, you’d come across a call to exec. Stepping into it would also work, but setting a breakpoint is the recommended way.

If you want to use codeblocks in a module of yours, you’ll need to call end_of_module at the end of your module. If you forget it, an exception will be raised during the rewriting. The code above, if in a module, would thus look as follows:

import codeblocks

codeblocks.rewrite()

def each(iterable, block):
    for e in iterable:
        block(e)

with each(range(0, 10)) << 'x':
    print('element ' + str(x))

codeblocks.end_of_module()

Nothing too invasive, I hope.

Creating a block

Unfortunately, Python syntax doesn’t let us create blocks as part of normal expressions: we need to use a statement which already takes a block of code. I chose with over for, if, etc…

Here’s the fastest way to create a block and assign it to a variable:

def id(x): return x

with my_block << id() << 'x, y = 3':
    print(x, y)

my_block(4)
my_block(1, 2)
print(my_block)

which prints

4 3
1 2
<function codeblock_1b4c94d3_1 at 0x02E7AC90>

I could’ve added special syntax for this very case, but I don’t think it’s worth it.
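
For comparison, this is approximately what the example amounts to without codeblocks. It is a sketch: the function name codeblock_1 is made up (the real name is generated, as the output above shows), and the argument literal 'x, y = 3' is assumed to become the parameter list as is.

```python
def id(x): return x             # shadows the built-in id, as in the example above

# The literal 'x, y = 3' becomes the parameter list, default value included.
def codeblock_1(x, y = 3):
    print(x, y)

# The block is passed to id as a positional argument; the result is bound to my_block.
my_block = id(codeblock_1)

my_block(4)
my_block(1, 2)
```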

Bug in Python

I’ve just discovered a bug in Python. A patch is already available, but your version could still suffer from this bug.

Here’s the bug. Create a file script.py with this content:

with open('test') as f:
	data = f.read()
with open('test') as f:
	data = f.read()

Now start debugging it with a debugger of your choice and, from the first line, jump to line 3. If it crashes, the bug is still there.

The easiest way to see this is by using pdb. Run it with python -m pdb script.py and, once in the debugger, issue the command j 3.

This bug is triggered only if you use with statements at the module level in imported modules which use codeblocks. Because this is not too big a limitation, I’ve decided to insert a bug check in codeblocks which raises an exception (with a meaningful error message) if the user writes code that would trigger this bug. When the bug is finally gone from official Python releases, I’ll remove the bug check for patched versions.

Let’s move on.

Pseudo-keywords

When Python doesn’t let you use a keyword inside a with statement, you can use a pseudo-keyword.

Here’s the short list of keywords and corresponding pseudo-keywords:

keyword                 pseudo-keyword
global identifiers      _global(identifiers)
nonlocal identifiers    _nonlocal(identifiers)
return expression       _return(expression)
yield expression        _yield(expression)
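
A quick way to see why a pseudo-keyword can be necessary: a with statement at module level is not inside a function, so Python rejects a real return there before any code runs, whereas a call such as _return(1) is just an ordinary expression to the compiler. The helper compiles below is my own, written only for this check; it says nothing about how codeblocks handles the pseudo-keywords internally.

```python
def compiles(source):
    """Return True if the source text is accepted by the Python compiler."""
    try:
        compile(source, '<example>', 'exec')
        return True
    except SyntaxError:
        return False

# A real return statement at module level is rejected at compile time...
with_keyword = "with ctx:\n    return 1\n"
# ...while a call to a name such as _return is syntactically fine.
with_pseudo = "with ctx:\n    _return(1)\n"

print(compiles(with_keyword), compiles(with_pseudo))
```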

Passing a block as a positional argument

To pass a block as a positional argument to a function you use this syntax:

with each(range(0, 10)) << 'x':
    print('element ' + str(x))

where each is called as each(range(0, 10), block).

Passing a block as a keyword argument

To pass the same block as a keyword argument of name block_name, you use a similar syntax:

with each(range(0, 10)) << block_name << 'x':
    print('element ' + str(x))

or also

with each(range(0, 10)) << 'block_name' << 'x':
    print('element ' + str(x))

This calls each as each(range(0, 10), block_name = block).

Notice that block_name can be a literal or an identifier, but the arguments for the block (just x in this example) must be a literal. There are two reasons for this:

  1. It’s harder for a normal with to be misidentified as a special with and then erroneously rewritten.
  2. Default-valued arguments aren’t allowed by Python’s syntax in that position, so a literal is needed in that case.

All the other parameters (such as block_name, in the example) can be both literals and identifiers.

Passing a block which takes no arguments

If the block takes no arguments, you should use the empty literal:

with each(range(0, 10)) << block_name << '':
    print('element ' + str(1))

If you don’t like it, you can use _ (underscore) or None:

with each(range(0, 10)) << block_name << _:
    print('element ' + str(1))

with each(range(0, 10)) << block_name << None:
    print('element ' + str(1))

Passing more than one block to a function

If a function takes more than one block, you need to use a multi with:

def take3(block1, block2, block3):
    for b in (block1, block2, block3):
        print('=== {} ===\n{}'.format(b, b()))

with take3() << ':multi':
    with '':                    # 1st pos arg
        _return("I'm block 1!")
    with block3 << '':
        _return("I'm block 3!")
    with '':                    # 2nd pos arg
        _return("I'm block 2!")

This prints

=== <function codeblock_1b4c94d3_6 at 0x02E5A930> ===
I'm block 1!
=== <function codeblock_1b4c94d3_8 at 0x02E5AE40> ===
I'm block 2!
=== <function codeblock_1b4c94d3_7 at 0x02E5AA98> ===
I'm block 3!

The last three with statements are internal withs and have a simpler syntax: they can only take an optional block_name and a required code_args, i.e. the arguments on the right of functions in non-internal withs.

In the example above, take3 is called as take3(anon_block1, anon_block2, block3 = anon_block3).
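
Spelled out in plain Python, that call looks like the sketch below. The names anon_block1, anon_block2 and anon_block3 are taken from the sentence above; take3 is changed to return the results instead of printing them, only so that they can be checked.

```python
def take3(block1, block2, block3):
    # Return the results instead of printing them, so they are easy to check.
    return [b() for b in (block1, block2, block3)]

def anon_block1():
    return "I'm block 1!"

def anon_block3():
    return "I'm block 3!"

def anon_block2():
    return "I'm block 2!"

# Unnamed blocks fill the positional slots in order of appearance;
# the named one is passed as a keyword argument.
results = take3(anon_block1, anon_block2, block3 = anon_block3)
print(results)
```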

Note that the keywords can be arbitrary names in some circumstances:

def take3(**kwargs):
    for name in kwargs:
        print('=== {} ===\n{}'.format(name, kwargs[name]()))

with take3() << ':multi':
    with 'num #1' << '':             # arbitrary strings!
        _return("I'm block 1!")
    with 'num #3' << '':
        _return("I'm block 3!")
    with 'num #2' << '':
        _return("I'm block 2!")

This prints

=== num #3 ===
I'm block 3!
=== num #2 ===
I'm block 2!
=== num #1 ===
I'm block 1!
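
Arbitrary strings work as keywords here because Python only requires keyword names to be identifiers when they are written literally in the call; a dictionary unpacked with ** into a function that accepts **kwargs may have any string keys in CPython. The sketch below shows that mechanism in isolation. Whether codeblocks uses exactly this form is an assumption on my part, and take3 returns its results instead of printing them.

```python
def take3(**kwargs):
    # Call every block and collect the results under the block's name.
    return {name: block() for name, block in kwargs.items()}

def anon_block1():
    return "I'm block 1!"

def anon_block3():
    return "I'm block 3!"

def anon_block2():
    return "I'm block 2!"

# 'num #1' is not a valid identifier, yet it is accepted as a keyword here.
results = take3(**{'num #1': anon_block1,
                   'num #3': anon_block3,
                   'num #2': anon_block2})
print(results)
```

The order in which the names come out depends on the dictionary ordering of the Python version in use, so it may differ from the output shown above.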

Passing one or more blocks to a function as a list

You can also pass one or more blocks to a function as a single list:

def take3(title, blocks):
    print(title)
    for b in blocks:
        print('    ' + b())

with take3('List of blocks:') << ':list':
    with '':
        _return("I'm block 1!")
    with '':
        _return("I'm block 3!")
    with '':
        _return("I'm block 2!")

This prints

List of blocks:
    I'm block 1!
    I'm block 3!
    I'm block 2!

Internal withs can take only one argument, in this case. This is what happens if you give one of them two arguments:

  File "C:\...\test.py", line 124
    with wrong_arg << '':             # arbitrary strings!
         ^
SyntaxError: :list forbids internal Withs with codekw args.

Passing one or more blocks to a function as a dictionary

You can pass one or more blocks to a function as a single dictionary as well:

def take3(dict, silent = True):
    for key in sorted(dict.keys()):
        print('=== {} ===\n{}'.format(key, dict[key]()))

with take3(silent = False) << ':dict':
    with 'block 1' << '':
        _return("I'm block 1!")
    with 'block 3' << '':
        _return("I'm block 3!")
    with 'block 2' << '':
        _return("I'm block 2!")

This prints:

=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm block 3!

Here’s another example:

import logging
import random

logging.basicConfig(level = logging.DEBUG)

with operations << dict() << ':dict':
    with add << 'x, y':
        logging.debug('doing {} + {}'.format(x, y))
        _return(x + y)
    with sub << 'x, y':
        logging.debug('doing {} - {}'.format(x, y))
        _return(x - y)
    with mul << 'x, y':
        logging.debug('doing {} * {}'.format(x, y))
        _return(x * y)
    with div << 'x, y':
        logging.debug('doing {} / {}'.format(x, y))
        _return(x / y)

opnd1 = random.randint(1, 100)
opnd2 = random.randint(1, 100)
op = random.choice(('add', 'sub', 'mul', 'div'))
print(operations[op](opnd1, opnd2))

Here are a few runs:

DEBUG:root:doing 10 / 87
0.11494252873563218

DEBUG:root:doing 34 + 50
84

DEBUG:root:doing 67 / 15
4.466666666666667

DEBUG:root:doing 5 - 51
-46

DEBUG:root:doing 10 * 52
520

You get the idea.
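
Without codeblocks, the same table of operations would be built from named functions, roughly as in this sketch. Logging is left out to keep it short, and passing a ready-made dictionary to dict is my reading of what the ':dict' form does.

```python
def add(x, y):
    return x + y

def sub(x, y):
    return x - y

def mul(x, y):
    return x * y

def div(x, y):
    return x / y

# The blocks are gathered into one dictionary, which is passed to dict();
# the result of that call is bound to operations.
operations = dict({'add': add, 'sub': sub, 'mul': mul, 'div': div})

print(operations['add'](34, 50))
```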

The result of the function in a with can be assigned to variables

You can assign the result of the function called in a with statement, but you can’t just use the assignment operator:

def take3(list, silent = True):
    for i in range(len(list)):
        print('=== {} ===\n{}'.format('block ' + str(i + 1), list[i]()))
    return list[0](), [list[1](), list[2]()]           # just a test

with (r1, [r2, r3]) << take3(silent = False) << list << ':list':
    with '':
        _return("I'm block 1!")
    with '':
        _return("I'm block 2!")
    with '':
        _return("I'm block 3!")

print('---')
print(r1, r2, r3)

This prints

=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm block 3!
---
I'm block 1! I'm block 2! I'm block 3!
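
The leftmost operand plays the role of an ordinary assignment target, so the plain-Python counterpart is a nested unpacking. In this sketch the names b1, b2 and b3 are invented, and take3 is reduced to its return statement.

```python
def take3(list, silent = True):
    # Same return value as in the example above, without the printing.
    return list[0](), [list[1](), list[2]()]

def b1():
    return "I'm block 1!"

def b2():
    return "I'm block 2!"

def b3():
    return "I'm block 3!"

# The blocks are passed as one list under the keyword 'list';
# the result is unpacked into the nested target.
r1, [r2, r3] = take3(silent = False, list = [b1, b2, b3])
print(r1, r2, r3)
```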

With statements can be nested

Of course they can be nested! Here we go:

import re
import random

def take3(dict, silent = True):
    for key in sorted(dict.keys()):
        print('=== {} ===\n{}'.format(key, dict[key]()))

with take3(silent = False) << ':dict':
    with 'block 1' << '':
        _return("I'm block 1!")
    with 'block 3' << '':
        text = "I'm surely codeblock number three!"
        with ris << re.sub(r'(\w)(\w+)(\w)', string = text) << repl << 'm':
            # From Python's doc.
            inner_word = list(m.group(2))
            random.shuffle(inner_word)
            _return (m.group(1) + "".join(inner_word) + m.group(3))
        _return (ris)
    with 'block 2' << '':
        _return("I'm block 2!")

Here are a few runs:

=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm seurly cbooledck nmuebr three!

=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm srluey clbceoodk nubmer terhe!

=== block 1 ===
I'm block 1!
=== block 2 ===
I'm block 2!
=== block 3 ===
I'm srluey cbodoelck nmeubr there!
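
The inner with of block 3 can be tried on its own in plain Python. This sketch assumes the block is turned into a function (called anon_func here) and passed to re.sub as the keyword argument repl.

```python
import random
import re

text = "I'm surely codeblock number three!"

def anon_func(m):
    # Shuffle the inner letters of a word, keeping the first and last in place.
    inner_word = list(m.group(2))
    random.shuffle(inner_word)
    return m.group(1) + "".join(inner_word) + m.group(3)

# The block is passed as the keyword argument repl; the result is bound to ris.
ris = re.sub(r'(\w)(\w+)(\w)', string = text, repl = anon_func)
print(ris)
```

Words shorter than three letters, such as the two halves of “I'm”, don't match the pattern and are left untouched.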

Code blocks allow global and nonlocal declarations

Because code blocks are implemented as real functions, you can use global and nonlocal in the usual way:

my_var = None

def each(iterable, block):
	for e in iterable:
		block(e)

def gen_funcs():
	sum = 0
	def acc_elems(*elems):
		# You would never use it like this: this is just a test, as always!
		with each(elems) << 'x':
			global my_var
			nonlocal sum
			sum += x
			my_var = 'modified from inside the block'
	def get_sum():
		return sum
	return get_sum, acc_elems

get_sum, acc_elems = gen_funcs()

print('my_var: ', my_var)
print('sum:   ', get_sum())

acc_elems(1, 2, 3)
print('after acc_elems(1, 2, 3):')
print('    my_var: ', my_var)
print('    sum:   ', get_sum())

acc_elems(4, 5, 6)
print('after acc_elems(4, 5, 6):')
print('    my_var: ', my_var)
print('    sum:   ', get_sum())

This prints

my_var:  None
sum:    0
after acc_elems(1, 2, 3):
    my_var:  modified from inside the block
    sum:    6
after acc_elems(4, 5, 6):
    my_var:  modified from inside the block
    sum:    21
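
Since a block is a real function, the example behaves like this plain-Python version, in which the block is written as a nested def. The name anon_func is invented for the sketch.

```python
my_var = None

def each(iterable, block):
    for e in iterable:
        block(e)

def gen_funcs():
    sum = 0
    def acc_elems(*elems):
        # The block becomes a nested function, so nonlocal reaches the
        # variable sum of gen_funcs and global reaches the module-level my_var.
        def anon_func(x):
            global my_var
            nonlocal sum
            sum += x
            my_var = 'modified from inside the block'
        each(elems, anon_func)
    def get_sum():
        return sum
    return get_sum, acc_elems

get_sum, acc_elems = gen_funcs()
acc_elems(1, 2, 3)
print(my_var, get_sum())
```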

Code blocks allow yield

No surprises here either:

def print_all(gen):
    for msg in gen(verb = True):
        print(msg)

with print_all() << 'n = 10, e = 3, verb = False':
    if verb:
        for i in range(n):
            _yield('{}^{} is {}'.format(i, e, i ** e))
    else:
        for i in range(n):
            _yield(i, i ** e)

This prints

0^3 is 0
1^3 is 1
2^3 is 8
3^3 is 27
4^3 is 64
5^3 is 125
6^3 is 216
7^3 is 343
8^3 is 512
9^3 is 729
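
The block above is equivalent to an ordinary generator function with default arguments. In this sketch the name anon_gen is invented, print_all collects the messages in a list instead of printing them, and _yield(i, i ** e) is assumed to yield a tuple.

```python
def print_all(gen):
    # Collect the messages instead of printing them, so they are easy to check.
    return list(gen(verb = True))

# The literal 'n = 10, e = 3, verb = False' becomes the parameter list.
def anon_gen(n = 10, e = 3, verb = False):
    if verb:
        for i in range(n):
            yield '{}^{} is {}'.format(i, e, i ** e)
    else:
        for i in range(n):
            yield i, i ** e

messages = print_all(anon_gen)
print('\n'.join(messages))
```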

That’s all!

As always, comments and constructive criticism are greatly appreciated.

16 Responses to “Code blocks in Python”

  1. cpatti said

    This is exciting! I’ve been a Ruby fan forever, but a bunch of folks here are fairly hard core Pythonians, and the one thing about Python I couldn’t bear is the lack of blocks. Thanks for the great work!

  2. Ian Kelly said

    Is it still possible to use normal with statements with codeblocks? I use context managers all the time, so not having access to them would be a major irritation to me.

    Passing blocks as keyword arguments: Do I understand correctly block_name and ‘block_name’ are equivalent and both result in a keyword of ‘block_name’? What if the keyword I want to use is stored in a variable; is there any way to pass the keyword from the variable? I would have expected using block_name as an identifier to handle this case.

    While I appreciate the hard work you’ve put in here, I doubt I’ll use this in any case. The syntax hoops that need to be jumped through to write these code blocks make them much less readable than simply defining and passing functions, and only slightly more concise, and so I can’t really see myself preferring this approach.

    • mtomassoli said

      - It’s still possible to use normal with statements.
      That’s why I use ‘<<’ and the last argument is a literal: there’s less risk of confusion.

      - block_name and 'block_name' are equivalent. Yep. At the beginning, everything was a literal but I decided to relax this restriction a bit.
      Regarding your last question about identifiers, there's no technical difficulty in doing what you ask, but that's just a feature I didn't think of.
      I didn't want to complicate things too much, but hey, why not?

  3. Ian Kelly said

    Another thought: To assign the result of the function to a variable, why not use the “as” clause of the with statement? The example would then become:

    with take3(silent = False) << list << ':list' as (r1, [r2, r3]):

    which I think would be more readable than overloading the << operator for yet another purpose.

    • mtomassoli said

      I thought about as, but isn’t that less readable because of the order?
      To me, this is a flow:
      retVar << func() << 'x':
      block
      Anyway, I don't mind changing the syntax.

      Since my time is limited, unfortunately, I'll make changes to codeblocks if I see that there is enough interest and enough discussion about how it should be modified.
      That's why I put my module on bitbucket. It's the first time I've used it, but surely there is a section for discussing features, bugs, etc., right?

  4. Paul said

    I don’t get it. What do you mean you can’t pass around blocks of code? Functions are proper objects in Python.

    def myfunc():
        statement
        statement
        statement

    x = myfunc

    Surely I’m missing something… I don’t have a ruby background, so I’m still de-tangling your syntax up above. But you can assign a function to a variable, pass that as input to another function, etc. What can’t you do that you need this for?

    • mtomassoli said

      You can’t pass around anonymous blocks of code.
      The main problem is that when a function takes another function as an argument, we have two options in Python:
      - define a new one-time function
      - use lambda
      Unfortunately, lambdas are limited to expressions.

      With my module you can write, for instance:

          with ris << re.sub(r'(\w)(\w+)(\w)', string = text) << repl << 'm':
              inner_word = list(m.group(2))
              random.shuffle(inner_word)
              _return (m.group(1) + "".join(inner_word) + m.group(3))
      

      which is rewritten (almost) as

          def anon_func(m):
              inner_word = list(m.group(2))
              random.shuffle(inner_word)
              return m.group(1) + "".join(inner_word) + m.group(3)
      
          ris = re.sub(r'(\w)(\w+)(\w)', string = text, repl = anon_func)
      

      Basically, it lets you define a block of code in place.

      You can also create new control statements, if you want:

          def each_elem_in(my_container, func):
              if my_container.type is Matrix:
                  .....
                  if my_container.compressed:
                      ....
                      func(i, j, elem_val)
                  else:
                      ....
                      func(i, j, elem_val)
              elif my_container.type is Vector:
                  .....
                  func(i, elem_val)
              else:
                  raise WrongContainer("elem_in: invalid container.")
      

      You would use it like this:

          with each_elem_in(my_matrix) << 'i, j, val':
              print('This is element ({}, {}) of value {}'.format(i, j, val))
      
          with each_elem_in(my_vector) << 'i, val':
              print('This is element {} of value {}'.format(i, val))
      
      • Ramit Prasad said

        Not having learned Ruby yet, I do not understand the desire for anonymous code blocks. You use function passing as a problem because you have to define a one time function…but isn’t that exactly what you are doing when you define a code block? What is the difference in terms of implementation and in terms of me creating a new function and a new code block?

      • mtomassoli said

        Consider these two requests:
        A) John, would you look after my dog while I’m away?
        B) dog_task = look after my dog while I’m away
        John, would you dog_task?
        Code blocks remove a level of unneeded indirection.
        After all, why were lambdas added to Python?

      • Ramit Prasad said

        I’m not sure rewriting the code and using non-pythonic syntax for creating an “anonymous” function is really removing “a level of unneeded indirection”.

        Pretty cool and impressive proof of concept. If you could clean up the syntax maybe you can get it in the fabled Python 4k.

  5. Éric Araujo said

    Can you tell more about the Python bug you’re mentioning? Do you know if it’s already reported at http://bugs.python.org/ ?

  6. Great module and a nice idea.

    How about a pypi release?

    Also, some of your method calls should really be made PEP8 compliant for readability, e.g. .end_module() is much nicer than the camelcase .endOfModule()

    • mtomassoli said

      I thought I eradicated camelCase completely from my module. Maybe I didn’t update the article…

      edit: Yes, I forgot to update the article. I fixed it. Thank you for letting me know!

      Regarding pypi, I’ll look into it (I’m kinda new to Python).
