Modern Python Cookbook
上QQ阅读APP看书,第一时间看更新

Writing better RST markup in docstrings

When we have a useful script, we often need to leave notes on what it does, how it works, and when it should be used. Many tools for producing documentation, including docutils, work with RST markup. What RST features can we use to make documentation more readable?

Getting ready

In the Including descriptions and documentation recipe, we looked at putting a basic set of documentation into a module. This is the starting point for writing our documentation. There are a large number of RST formatting rules. We'll look at a few that are important for creating readable documentation.

How to do it...

  1. Be sure to write an outline of the key points. This may lead to creating RST section titles to organize the material. A section title is a two-line paragraph with the title followed by an underline using =, -, ^, ~, or one of the other docutils characters for underlining.

    A heading will look like this:

            Topic 
            ===== 
    

    The heading text is on one line and the underlining characters are on the next line. This must be surrounded by blank lines. There can be more underline characters than title characters, but not fewer.

    The RST tools will infer our pattern of using underlining characters. As long as the underline characters are used consistently, the algorithm for matching underline characters to the desired heading will detect the pattern. The keys to this are consistency and a clear understanding of sections and subsections.

    When starting out, it can help to make an explicit reminder sticky note like this:

    Example of heading characters

  2. Fill in the various paragraphs. Separate paragraphs (including the section titles) with blank lines. Extra blank lines don't hurt. Omitting blank lines will lead the RST parsers to see a single, long paragraph, which may not be what we intended.

    We can use inline markup for emphasis, strong emphasis, code, hyperlinks, and inline math, among other things. If we're planning on using Sphinx, then we have an even larger collection of text roles that we can use. We'll look at these techniques soon.

  3. If the programming editor has a spell checker, use that. This can be frustrating because we'll often have code samples that may include abbreviations that fail spell checking.

How it works...

The docutils conversion programs will examine the document, looking for sections and body elements. A section is identified by a title. The underlines are used to organize the sections into a properly nested hierarchy. The algorithm for deducing this is relatively simple and has these rules:

  • If the underline character has been seen before, the level is known
  • If the underline character has not been seen before, then it must be indented one level below the previous outline level
  • If there is no previous level, this is level one

A properly nested document might have the following sequence of underline characters:

    TITLE
    =====
    SOMETHING
    ---------
    MORE    
    ^^^^
   
    EXTRA
    ^^^^^
    LEVEL 2
    -------
    LEVEL 3
    ^^^^^^^

We can see that the first title underline character, =, will be level one. The next, -, is unknown but appears after a level one, so it must be level two. The third headline has ^, which is previously unknown, is inside level two, and therefore must be level three. The next ^ is still level three. The next two, - and ^, are known to be level two and three respectively.

From this overview, we can see that inconsistency will lead to confusion.

If we change our mind partway through a document, this algorithm can't detect that. If—for inexplicable reasons—we decide to skip over a level and try to have a level four heading inside a level two section, that simply can't be done.

There are several different kinds of body elements that the RST parser can recognize. We've shown a few. The more complete list includes:

  • Paragraphs of text: These might use inline markup for different kinds of emphasis or highlighting.
  • Literal blocks: These are introduced with :: and indented four spaces. They may also be introduced with the .. parsed-literal:: directive. A doctest block is indented four spaces and includes the Python >>> prompt.
  • Lists, tables, and block quotes: We'll look at these later. These can contain other body elements.
  • Footnotes: These are special paragraphs. When rendered, they may be put on the bottom of a page or at the end of a section. These can also contain other body elements.
  • Hyperlink targets, substitution definitions, and RST comments: These are specialized text items.

There's more...

For completeness, we'll note here that RST paragraphs are separated by blank lines. There's quite a bit more to RST than this core rule.

In the Including descriptions and documentation recipe, we looked at several different kinds of body elements we might use:

  • Paragraphs of Text: This is a block of text surrounded by blank lines. Within these, we can make use of inline markup to emphasize words, or to use a font to show that we're referring to elements of our code. We'll look at inline markup in the Using inline markup recipe.
  • Lists: These are paragraphs that begin with something that looks like a number or a bullet. For bullets, use a simple or *. Other characters can be used, but these are common. We might have paragraphs like this.

    It helps to have bullets because:

  • They can help clarify
  • They can help organize
  • Numbered Lists: There are a variety of patterns that are recognized. We might use a pattern like one of the four most common kinds of numbered paragraphs:
    1. Numbers followed by punctuation like . or ).
    2. A letter followed by punctuation like . or ).
    3. A Roman numeral followed by punctuation.
    4. A special case of # with the same punctuation used on the previous items. This continues the numbering from the previous paragraphs.
  • Literal Blocks: A code sample must be presented literally. The text for this must be indented. We also need to prefix the code with ::. The :: character must either be a separate paragraph or the end of a lead-in to the code example.
  • Directives: A directive is a paragraph that generally looks like .. directive::. It may have some content that's indented so that it's contained within the directive. It might look like this:
       ..  important::
            Do not flip the bozo bit.

The .. important:: paragraph is the directive. This is followed by a short paragraph of text indented within the directive. In this case, it creates a separate paragraph that includes the admonition of important.

Using directives

Docutils has many built-in directives. Sphinx adds a large number of directives with a variety of features.

Some of the most commonly used directives are the admonition directives: attention, caution, danger, error, hint, important, note, tip, warning, and the generic admonition. These are compound body elements because they can have multiple paragraphs and nested directives within them.

We might have things like this to provide appropriate emphasis:

    ..  note:: Note Title
        We need to indent the content of an admonition.
        This will set the text off from other material.

One of the other common directives is the parsed-literal directive:

    ..  parsed-literal::
        any text
            *almost* any format
        the text is preserved
            but **inline** markup can be used.

This can be handy for providing examples of code where some portion of the code is highlighted. A literal like this is a simple body element, which can only have text inside. It can't have lists or other nested structures.

Using inline markup

Within a paragraph, we have several inline markup techniques we can use:

  • We can surround a word or phrase with * for *emphasis*. This is commonly typeset as italic.
  • We can surround a word or phrase with ** for **strong**. This is commonly typeset as bold.
  • We surround references with single back-ticks (`, it's on the same key as the ~ on most keyboards). Links are followed by an underscore, "_". We might use `section title`_ to refer to a specific section within a document. We don't generally need to put anything around URLs. The docutils tools recognize these. Sometimes we want a word or phrase to be shown and the URL concealed. We can use this: `the Sphinx documentation <http://www.sphinx-doc.org/en/stable/>`_.
  • We can surround code-related words with a double back-tick (``) to make them look like ``code``.

There's also a more general technique called a text role. A role is a little more complex-looking than simply wrapping a word or phrase in * characters. We use :word: as the role name followed by the applicable word or phrase in single ` back-ticks. A text role looks like this :strong:`this`.

There are a number of standard role names, including :emphasis:, :literal:, :code:, :math:, :pep-reference:, :rfc-reference:, :strong:, :subscript:, :superscript:, and :title-reference:. Some of these are also available with simpler markup like *emphasis* or **strong**. The rest are only available as explicit roles.

Also, we can define new roles with a simple directive. If we want to do very sophisticated processing, we can provide docutils with class definitions for handling roles, allowing us to tweak the way our document is processed. Sphinx adds a large number of roles to support detailed cross-references among functions, methods, exceptions, classes, and modules.

See also