Modern Python Cookbook
上QQ阅读APP看书,第一时间看更新

Writing long lines of code

There are many times when we need to write lines of code that are so long that they're very hard to read. Many people like to limit the length of a line of code to 80 characters or fewer. It's a well-known principle of graphic design that a narrower line is easier to read. See http://webtypography.net/2.1.2 for a deeper discussion of line width and readability.

While shorter lines are easier on the eyes, our code can refuse to cooperate with this principle. Long statements are a common problem. How can we break long Python statements into more manageable pieces?

Getting ready

Often, we'll have a statement that's awkwardly long and hard to work with. Let's say we've got something like this:

>>> import math
>>> example_value = (63/25) * (17+15*math.sqrt(5)) / (7+15*math.sqrt(5))
>>> mantissa_fraction, exponent = math.frexp(example_value)
>>> mantissa_whole = int(mantissa_fraction*2**53)
>>> message_text = f'the internal representation is {mantissa_whole:d}/2**53*2**{exponent:d}'
>>> print(message_text)
the internal representation is 7074237752514592/2**53*2**2

This code includes a long formula, and a long format string into which we're injecting values. This looks bad when typeset in a book; the f-string line may be broken incorrectly. It looks bad on our screen when trying to edit this script.

We can't haphazardly break Python statements into chunks. The syntax rules are clear that a statement must be complete on a single logical line.

The term "logical line" provides a hint as to how we can proceed. Python makes a distinction between logical lines and physical lines; we'll leverage these syntax rules to break up long statements.

How to do it...

Python gives us several ways to wrap long statements so they're more readable:

  • We can use \ at the end of a line to continue onto the next line.
  • We can leverage Python's rule that a statement can span multiple logical lines because the (), [], and {} characters must balance. In addition to using () or \, we can also exploit the way Python automatically concatenates adjacent string literals to make a single, longer literal; ("a" "b") is the same as "ab".
  • In some cases, we can decompose a statement by assigning intermediate results to separate variables.

We'll look at each one of these in separate parts of this recipe.

Using a backslash to break a long statement into logical lines

Here's the context for this technique:

>>> import math
>>> example_value = (63/25) * (17+15*math.sqrt(5)) / (7+15*math.sqrt(5))
>>> mantissa_fraction, exponent = math.frexp(example_value)
>>> mantissa_whole = int(mantissa_fraction*2**53)

Python allows us to use \ to break the logical line into two physical lines:

  1. Write the whole statement on one long line, even if it's confusing:
    >>> message_text = f'the internal representation is {mantissa_whole:d}/2**53*2**{exponent:d}'
    
  2. If there's a meaningful break, insert the \ to separate the statement:
    >>> message_text = f'the internal representation is \
    ...     {mantissa_whole:d}/2**53*2**{exponent:d}'
    

For this to work, the \ must be the last character on the line. We can't even have a single space after the \. An extra space is fairly hard to see; for this reason, we don't encourage using back-slash continuation like this. PEP 8 provides guidelines on formatting and discourages this.

In spite of this being a little hard to see, the \ can always be used. Think of it as the last resort in making a line of code more readable.

Using the () characters to break a long statement into sensible pieces

  1. Write the whole statement on one line, even if it's confusing:
    >>> import math
    >>> example_value1 = (63/25) * (17+15*math.sqrt(5)) / (7+15*math.sqrt(5))
    
  2. Add the extra () characters, which don't change the value, but allow breaking the expression into multiple lines:
    >>> example_value2 = (63/25) * ( (17+15*math.sqrt(5)) / (7+15*math.sqrt(5)) )
    >>> example_value2 == example_value1
    True
    
  3. Break the line inside the () characters:
    >>> example_value3 = (63/25) * (
    ...      (17+15*math.sqrt(5))
    ...    / (7+15*math.sqrt(5))
    ... )
    >>> example_value3 == example_value1
    True
    

The matching () character's technique is quite powerful and will work in a wide variety of cases. This is widely used and highly recommended.

We can almost always find a way to add extra () characters to a statement. In rare cases when we can't add () characters, or adding () characters doesn't improve readability, we can fall back on using \ to break the statement into sections.

Using string literal concatenation

We can combine the () characters with another rule that joins adjacent string literals. This is particularly effective for long, complex format strings:

  1. Wrap a long string value in the () characters.
  2. Break the string into substrings:
    >>> message_text = (
    ...     f'the internal representation '
    ...     f'is {mantissa_whole:d}/2**53*2**{exponent:d}'
    ... )
    >>> message_text
    'the internal representation is 7074237752514592/2**53*2**2'
    

We can always break a long string into adjacent pieces. Generally, this is most effective when the pieces are surrounded by () characters. We can then use as many physical line breaks as we need. This is limited to those situations where we have particularly long string literals.

Assigning intermediate results to separate variables

Here's the context for this technique:

>>> import math
>>> example_value = (63/25) * (17+15*math.sqrt(5)) / (7+15*math.sqrt(5))

We can break this into three intermediate values:

  1. Identify sub-expressions in the overall expression. Assign these to variables:
    >>> a = (63/25)
    >>> b = (17+15*math.sqrt(5))
    >>> c = (7+15*math.sqrt(5))
    

    This is generally quite simple. It may require a little care to do the algebra to locate sensible sub-expressions.

  2. Replace the sub-expressions with the variables that were created:
    >>> example_value = a * b / c
    

We can always take a sub-expression and assign it to a variable, and use the variable everywhere the sub-expression was used. The 15*sqrt(5) product is repeated; this, too, is a good candidate for refactoring the expression.

We didn't give these variables descriptive names. In some cases, the sub-expressions have some semantics that we can capture with meaningful names. In this case, however, we chose short, arbitrary identifiers instead.

How it works...

The Python Language Manual makes a distinction between logical lines and physical lines. A logical line contains a complete statement. It can span multiple physical lines through techniques called line joining. The manual calls the techniques explicit line joining and implicit line joining.

The use of \ for explicit line joining is sometimes helpful. Because it's easy to overlook, it's not generally encouraged. PEP 8 suggests this should be the method of last resort.

The use of () for implicit line joining can be used in many cases. It often fits semantically with the structure of the expressions, so it is encouraged. We may have the () characters as a required syntax. For example, we already have () characters as part of the syntax for the print() function. We might do this to break up a long statement:

>>> print(
...    'several values including',
...    'mantissa =', mantissa,
...    'exponent =', exponent
... )

There's more...

Expressions are used widely in a number of Python statements. Any expression can have () characters added. This gives us a lot of flexibility.

There are, however, a few places where we may have a long statement that does not specifically involve an expression. The most notable example of this is the import statement—it can become long, but doesn't use any expressions that can be parenthesized. In spite of not having a proper expression, it does, however, still permit the use of (). The following example shows we can surround a very long list of imported names:

>>> from math import ( 
...    sin, cos, tan,
...    sqrt, log, frexp)

In this case, the () characters are emphatically not part of an expression. The () characters are available syntax, included to make the statement consistent with other statements.

See also

  • Implicit line joining also applies to the matching [] and {} characters. These apply to collection data structures that we'll look at in Chapter 4, Built-In Data Structures Part 1: Lists and Sets.