March 10, 2008

Approaching Python 3000: string formatting

One of the changes in Python 3000 concerns string formatting, and PEP 3101 is the governing document. In short, it argues that the familiar % operator is too limited and that what we need is a more powerful domain-specific language for string formatting.

To be frank, I never felt limited with what % had to offer, but there is no point in criticizing what's about to become standard. Let's see what's new:

1. What used to be a binary operator is now a method of str:
"{0}, {1}".format("A", 10) == "A, 10"
"{n} = {v}".format(n = "N", v = "V") == "N = V"
2. Formatting and alignment are specified after a colon, much as before:
"{0:03d}".format(10) == "010"
"<{S:>5s}>".format(S = "foo") == "<  foo>"
"<{S:<5s}>".format(S = "foo") == "<foo  >"
3. A format string can insert not just argument values, but also their items and attributes:
d = dict(foo = 1, bar = "a")
"{0[foo]}, {0[bar]}".format(d) == "1, a"
"{0.__class__.__name__}".format(d) == "dict"
4. Recursive substitution is allowed to a degree. For example, this works:
d = dict(value = 10, format = "03d")
"{0[value]:{0[format]}}".format(d) == "010"
but this doesn't:
d = dict(data = {"a": "A", "b": "B"}, key = "a")
"{0[data][{0[key]}]}".format(d)
5. Classes can control their own formatting:
class Foo():
    def __format__(self, format):
        from re import match
        assert match("[0-9]+s", format)
        return "x" * int(format[:-1])
foo = Foo()

"{0:3s}".format(foo) == "xxx"
"{0:10s}".format(foo) == "xxxxxxxxxx"
I agree that the new way of string formatting (1 and 2) is cleaner and more straightforward. Substituting items and attributes (3) could be useful sometimes. Custom formatting (5) is Pythonic all right, but hardly useful in practice, except perhaps when building a framework or a class library.
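
For comparison, here is a quick sketch of how the examples in (1) and (2) line up against the old % operator. The variable names are mine and purely illustrative, and this is a side-by-side of a few common cases, not a complete mapping between the two syntaxes.

```python
# Old style: the % operator with a tuple or a dict on the right-hand side.
old_positional = "%s, %d" % ("A", 10)
old_named = "%(n)s = %(v)s" % {"n": "N", "v": "V"}
old_padded = "%03d" % 10
old_right = "<%5s>" % "foo"    # right-aligned by default
old_left = "<%-5s>" % "foo"    # "-" flag means left-aligned

# New style: the same cases written with str.format().
new_positional = "{0}, {1}".format("A", 10)
new_named = "{n} = {v}".format(n="N", v="V")
new_padded = "{0:03d}".format(10)
new_right = "<{0:>5s}>".format("foo")   # ">" means right-aligned
new_left = "<{0:<5s}>".format("foo")    # "<" means left-aligned

# Both styles should produce identical strings for these cases.
assert old_positional == new_positional
assert old_named == new_named
assert old_padded == new_padded
assert old_right == new_right
assert old_left == new_left
```

The visible difference is mostly in alignment: the explicit ">" and "<" replace the old "-" flag, and named arguments no longer need a dict.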

What I don't buy is the attempt to make a statement out of what is supposed to be an expression (4). If a feature is inconsistent, difficult to read and has no apparent use, it should not be there.
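
To illustrate the point about (4), here is a small sketch that reuses the dictionaries from the examples above (d2 is just my name for the second one). It assumes nothing beyond str.format() and the built-in format() function, and shows that both the nested case that works and the one that doesn't can be written as two plain, explicit steps.

```python
d = dict(value=10, format="03d")

# The nested form from (4): the format spec itself is looked up in d.
nested = "{0[value]:{0[format]}}".format(d)

# The same result in explicit steps: fetch the spec, then apply it.
explicit = format(d["value"], d["format"])

# The nested item lookup from (4) that is not supported can be done
# in ordinary Python before the value reaches the format string.
d2 = dict(data={"a": "A", "b": "B"}, key="a")
looked_up = "{0}".format(d2["data"][d2["key"]])

assert nested == explicit
```

Whether the one-line nested version is worth the extra syntax is a matter of taste; the explicit version is arguably easier to read.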
