Modern Python Cookbook

Managing a context using the with statement

There are many instances where our scripts will be entangled with external resources. The most common examples are disk files and network connections to external hosts. A common bug is retaining these entanglements forever, tying up these resources uselessly. These are often called resource leaks, or more loosely memory leaks, because the available file handles and memory are reduced each time a new file is opened without closing a previously used file.

We'd like to isolate each entanglement so that we can be sure that the resource is acquired and released properly. The idea is to create a context in which our script uses an external resource. At the end of the context, our program is no longer bound to the resource and we want to be guaranteed that the resource is released.

Getting ready

Let's say we want to write lines of data to a file in CSV format. When we're done, we want to be sure that the file is closed and the various OS resources—including buffers and file handles—are released. We can do this in a context manager, which guarantees that the file will be properly closed.

Since we'll be working with CSV files, we can use the csv module to handle the details of the formatting:

>>> import csv

We'll also use the pathlib module to locate the files we'll be working with:

>>> from pathlib import Path

For the purposes of having something to write, we'll use this silly data source:

>>> some_source = [[2,3,5], [7,11,13], [17,19,23]]

This will give us a context in which to learn about the with statement.

How to do it...

  1. Create the context by opening the path, or creating the network connection with urllib.request.urlopen(). Other common contexts include archives like zip files and tar files:
    >>> target_path = Path.cwd()/"data"/"test.csv"
    >>> with target_path.open('w', newline='') as target_file:
    
  2. Include all the processing, indented within the with statement:
    >>> target_path = Path.cwd()/"data"/"test.csv"
    >>> with target_path.open('w', newline='') as target_file:
    ...     writer = csv.writer(target_file)
    ...     writer.writerow(['column', 'data', 'heading'])
    ...     writer.writerows(some_source)
    
  3. When we use a file as a context manager, the file is automatically closed at the end of the indented context block. Even if an exception is raised, the file is still closed properly. Outdent the processing that is done after the context is finished and the resources are released:
    >>> target_path = Path.cwd()/"data"/"test.csv"
    >>> with target_path.open('w', newline='') as target_file:
    ...     writer = csv.writer(target_file)
    ...     writer.writerow(['column', 'data', 'heading'])
    ...     writer.writerows(some_source)
    >>> print(f'finished writing {target_path.name}')
    

The statements outside the with context will be executed after the context is closed. The named resource—the file opened by target_path.open()—will be properly closed.

Even if an exception is raised inside the with statement, the file is still properly closed. The context manager is notified of the exception. It can close the file and allow the exception to propagate.
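
As a rough sketch of that guarantee, the with statement for a file behaves much like the try/finally block below. This is an approximation for illustration, not the exact expansion the interpreter uses. A temporary directory is assumed here in place of the recipe's data directory so the example can run anywhere.

```python
import csv
import tempfile
from pathlib import Path

some_source = [[2, 3, 5], [7, 11, 13], [17, 19, 23]]

# Assumption: a temporary directory stands in for the recipe's data directory.
with tempfile.TemporaryDirectory() as tmp:
    target_path = Path(tmp) / "test.csv"

    # Approximately what `with target_path.open(...) as target_file:` does.
    target_file = target_path.open("w", newline="")
    try:
        writer = csv.writer(target_file)
        writer.writerow(["column", "data", "heading"])
        writer.writerows(some_source)
    finally:
        # Runs on normal exit and also when an exception is propagating.
        target_file.close()

    closed_after_block = target_file.closed
    written_lines = target_path.read_text().splitlines()

print(closed_after_block)  # True
print(written_lines[0])    # column,data,heading
```

The with statement expresses the same intent in fewer lines and makes it harder to forget the cleanup step.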

How it works...

A context manager is notified of three significant events surrounding the indented block of code:

  • Entry
  • Normal exit with no exception
  • Exit with an exception pending

The context manager will—under all conditions—disentangle our program from external resources. Files can be closed. Network connections can be dropped. Database transactions can be committed or rolled back. Locks can be released.
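
To make the three events concrete, here is a minimal sketch of a class that implements the context manager protocol through the __enter__() and __exit__() special methods. The EventRecorder class is a hypothetical name invented for this illustration; it manages no real resource and only records which events it was notified of.

```python
class EventRecorder:
    """Hypothetical context manager that records the events it sees."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        # Called on entry to the indented block.
        self.events.append("entry")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called on exit; exc_type is None when no exception is pending.
        if exc_type is None:
            self.events.append("normal exit")
        else:
            self.events.append(f"exit with {exc_type.__name__}")
        # Returning False allows a pending exception to propagate.
        return False


normal = EventRecorder()
with normal:
    pass

failing = EventRecorder()
try:
    with failing:
        raise ValueError("Testing")
except ValueError:
    pass

print(normal.events)   # ['entry', 'normal exit']
print(failing.events)  # ['entry', 'exit with ValueError']
```

A real context manager would release its resource inside __exit__() in both the normal and the exception case.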

We can experiment with this by including a manual exception inside the with statement. This can show that the file was properly closed:

>>> try:
...     with target_path.open('w', newline='') as target_file:
...         writer = csv.writer(target_file)
...         writer.writerow(['column', 'data', 'heading'])
...         writer.writerow(some_source[0])
...         raise Exception("Testing")
... except Exception as exc:
...     print(f"{target_file.closed=}")
...     print(f"{exc=}")
>>> print(f"Finished Writing {target_path.name}")

In this example, we've wrapped the real work in a try statement. This allows us to raise an exception after writing the first line of data to the CSV file. Because the exception handler is outside the with statement, the file has already been closed by the time the except clause runs. All resources are released, and the part that was written is accessible and usable by other programs.

The output confirms the expected file state:

target_file.closed=True
exc=Exception('Testing')

This shows us that the file was properly closed. It also shows us the message associated with the exception to confirm that it was the exception we raised manually. This kind of technique allows us to work with expensive resources like database connections and network connections and be sure these don't "leak." A resource leak is a common description used when resources are not released properly back to the OS; it's as if they slowly drain away, and the application stops working because there are no more available OS network sockets or file handles. The with statement can be used to properly disentangle our Python application from OS resources.

There's more...

Python offers us a number of context managers. We noted that an open file is a context, as is an open network connection created by urllib.request.urlopen().

For all file operations and all network connections, we should always use a with statement to manage the resource. It's very difficult to find an exception to this rule.

It turns out that the decimal module makes use of a context manager to allow localized changes to the way decimal arithmetic is performed. We can use the decimal.localcontext() function as a context manager to change rounding rules or precision for calculations isolated by a with statement.
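
A small sketch of this, assuming the default decimal context has not been modified elsewhere in the program: the precision is lowered to four significant digits inside the with statement, and the previous context is restored on exit.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    # The change applies only inside this block.
    ctx.prec = 4
    third_inside = Decimal(1) / Decimal(3)

# The previous context is active again here.
third_outside = Decimal(1) / Decimal(3)

print(third_inside)  # 0.3333
print(third_outside)
```

The second result carries the full precision of the surrounding context, which shows that the change did not leak out of the with statement.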

We can also define our own context managers. The contextlib module contains functions and decorators that can help us create context managers around resources that don't explicitly offer them.
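
As one possible illustration, the contextlib.contextmanager decorator turns a generator function into a context manager. The LegacyConnection class and the managed_connection() function below are hypothetical names made up for this sketch; they stand in for any resource that has a close() method but does not support the with statement directly.

```python
from contextlib import contextmanager


class LegacyConnection:
    """Hypothetical resource with close() but no with statement support."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


@contextmanager
def managed_connection():
    conn = LegacyConnection()
    try:
        # Code before the yield runs on entry; the yielded value
        # is bound to the name after `as`.
        yield conn
    finally:
        # Code in the finally clause runs on exit, with or without an exception.
        conn.close()


try:
    with managed_connection() as conn:
        was_open_inside = conn.is_open
        raise RuntimeError("Testing")
except RuntimeError:
    pass

print(was_open_inside)  # True
print(conn.is_open)     # False
```

For the simple case of an object that only needs close() called, contextlib.closing() offers a ready-made wrapper that does much the same thing.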

When working with locks, the with statement context manager is the ideal way to acquire and release a lock. See https://docs.python.org/3/library/threading.html#with-locks for the relationship between a lock object created by the threading module and a context manager.
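
The following sketch shows a threading.Lock used as a context manager. The shared totals dictionary, the add_many() function, and the thread and iteration counts are arbitrary choices made for this illustration.

```python
import threading

totals = {"count": 0}
totals_lock = threading.Lock()


def add_many(n):
    for _ in range(n):
        # The lock is acquired on entry and released on exit,
        # even if the body raises an exception.
        with totals_lock:
            totals["count"] += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(totals["count"])       # 40000
print(totals_lock.locked())  # False
```

Using the with statement here avoids pairing acquire() and release() calls by hand, which is where a forgotten release would otherwise leave the lock held.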

See also

  • See https://www.python.org/dev/peps/pep-0343/ for the origins of the with statement.
  • Numerous recipes in Chapter 9, Functional Programming Features, will make use of this technique. The recipes Reading delimited files with the csv module, Reading complex formats using regular expressions, and Reading HTML documents, among others, will make use of the with statement.