Python Package Code Style, Format and Linters

Take Aways

Consistent code format and style helps both your package and the broader scientific Python ecosystem, because code written in similar formats is easier to read.

For instance, if you saw a sentence like this one without any spaces or punctuation, it would take your brain longer to process it.

forinstanceifyousawasentencelikethisonewithoutany...

The pyOpenSci peer review process requires that you follow standard Python PEP 8 format rules as closely as you can.

pyOpenSci doesn’t require you to use a specific code format tool. However, we do look for consistency and readability in code style. Below you will find a discussion of:

  1. The benefits of using linters and code format tools in your workflow
  2. Some commonly used tools in the scientific Python space
  3. Setting up pre-commit hooks and the pre-commit.ci bot to make using code format tools in daily workflows and in pull requests on GitHub easier.

Use a code format tool (or tools) to make your life easier#

We suggest that you use a code format tool, or a set of format tools, because manually applying all of the PEP 8 format specifications is time consuming for maintainers and can be a roadblock for potential new contributors. Code formatters will automagically reformat your code for you, adhering to PEP 8 standards and applying consistent style decisions throughout.

Setting up a code format suite of tools will:

Many packages use a suite of tools to apply code format rules, taking the work out of manually implementing code format requirements.

Consistent code format across packages within the (scientific) Python ecosystem will also make code easier to scan, understand and contribute to.

Linting vs format and style#

Before we dive in let’s get a few definitions out of the way.

Code Linting#

A code linter is a tool that will review your code and identify errors or issues. A linter typically does not modify your code. It will tell you what the error is and on what line it was discovered. Flake8, discussed below, is an example of a commonly-used code linter.
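To make the "report, but do not modify" behavior concrete, here is a minimal sketch of a single lint check written with Python's standard `ast` module. The function name `find_unused_imports` and the file name `example.py` are made up for illustration; real linters such as Flake8 implement this check far more thoroughly.

```python
import ast


def find_unused_imports(source: str) -> list[tuple[int, str]]:
    """Return (line_number, name) for each imported name that is never used."""
    tree = ast.parse(source)
    imported: dict[str, int] = {}
    used: set[str] = set()

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)

    return sorted((line, name) for name, line in imported.items() if name not in used)


example = "import os\nimport sys\n\nprint(sys.argv)\n"

# The source string is only read, never rewritten: the linter just reports.
for line, name in find_unused_imports(example):
    print(f"example.py:{line}: '{name}' imported but unused")
```

The check returns a list of findings and leaves `example` untouched, which is the key difference between a linter and the formatters described next.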

Code Formatters (and stylers)#

Code formatters will reformat your code for you. Python-focused code formatters often follow PEP 8 standards. However, they also make stylistic decisions about code consistency.

Black is an example of a commonly-used code formatter. Black applies PEP 8 standards while also making decisions about things like consistent use of double quotes for strings, and spacing of items in lists.

You will learn more about Black below.
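The core idea of a formatter, rewriting the text of your code without changing what it means, can be sketched with the standard library alone. The snippet below is not how Black works internally and does not produce Black's style (for example, `ast.unparse` drops comments and prefers single quotes); it is only an illustration of "same program, consistent layout".

```python
import ast

messy = "x=[1,2 ,3]\ny = {'a':1}\n"

# Round-tripping through the syntax tree discards the original spacing,
# so the regenerated text comes out with one consistent layout.
tidy = ast.unparse(ast.parse(messy))
print(tidy)

# A formatter must not change behavior: both versions parse to the same tree.
assert ast.dump(ast.parse(messy)) == ast.dump(ast.parse(tidy))
```

Comparing the syntax trees before and after is a cheap way to confirm that only the presentation changed.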

Code linting, formatting and styling tools#

Black#

Black is a code formatter. Black will automagically (and unapologetically) fix spacing issues and ensure code format is consistent throughout your package. Black also generally adheres to PEP 8 style guidelines with some exceptions. A few examples of those exceptions are below:

Tip

If you are interested in seeing how Black will format your code, you can use the Black playground

Using a code formatter like Black will leave you more time to work on code function rather than worry about format.

Flake8#

To adhere to Python PEP 8 format standards, you might want to add flake8 to your code format toolbox.

flake8 will:

Flake8 also flags unused imports and unused declared variables in your modules.

Below you can see the output of running flake8 filename.py at the command line for a Python file within a package called stravalib.

The line length standard for PEP 8 is 79 characters.

Notice that flake8 returns a list of issues that it found in the model.py module on the command line. The Python file itself is not modified. Using this output, you can fix each issue line by line manually.

(stravalib-dev) username@computer stravalib % flake8 stravalib/model.py
stravalib/model.py:8:1: F401 'os' imported but unused
stravalib/model.py:29:80: E501 line too long (90 > 79 characters)
stravalib/model.py:34:80: E501 line too long (95 > 79 characters)
stravalib/model.py:442:80: E501 line too long (82 > 79 characters)
stravalib/model.py:443:39: E231 missing whitespace after ','
stravalib/model.py:493:20: E225 missing whitespace around operator
stravalib/model.py:496:80: E501 line too long (82 > 79 characters)
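Each line of the flake8 output above follows the pattern `path:line:column: CODE message`. If you want to summarize a long report, for instance to see which issue is most common before you start fixing things, a small script can parse it. This is a sketch under the assumption that paths contain no colons (so it would need adjusting for Windows drive letters); `parse_report` is a hypothetical helper, not part of flake8.

```python
import re
from collections import Counter

# path:line:column: CODE message
FLAKE8_LINE = re.compile(
    r"^(?P<path>[^:]+):(?P<line>\d+):(?P<col>\d+): (?P<code>[A-Z]+\d+) (?P<message>.*)$"
)

report = """\
stravalib/model.py:8:1: F401 'os' imported but unused
stravalib/model.py:29:80: E501 line too long (90 > 79 characters)
stravalib/model.py:34:80: E501 line too long (95 > 79 characters)
stravalib/model.py:443:39: E231 missing whitespace after ','
"""


def parse_report(text: str) -> list[dict]:
    """Turn flake8's text output into a list of dictionaries, one per issue."""
    issues = []
    for raw in text.splitlines():
        match = FLAKE8_LINE.match(raw)
        if match:
            issue = match.groupdict()
            issue["line"] = int(issue["line"])
            issue["col"] = int(issue["col"])
            issues.append(issue)
    return issues


issues = parse_report(report)
counts = Counter(issue["code"] for issue in issues)
for code, count in counts.most_common():
    print(code, count)
```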

Isort#

Python imports refer to the Python packages that a module in your package requires. Imports should always be located at the top of each Python module in your package.

PEP 8 has specific standards for the order of these imports. These standards are listed below:

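Imports should be grouped in the following order:

  1. Standard library imports
  2. Related third party imports
  3. Local application / library specific imports

PEP 8 also asks for a blank line between each group of imports.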

While flake8 will identify unused imports in your code, it won’t fix or identify issues with the order of package imports.

isort will identify where imports in your code are out of order. It will then modify your code, automatically reordering all imports. This leaves you with one less thing to think about when cleaning up your code.

Example application of isort#

Code imports before isort is run:

Below, pandas is a third party package, typing is a core Python package distributed with Python, and examplePy.temperature is a first-party module, which means it belongs to the same package as the file doing the import. Also notice that there are no blank lines between the imports listed below.

from examplePy.temperature import fahrenheit_to_celsius
import pandas
from typing import Sequence

From the project root, run:

isort src/examplePy/temporal.py

Imports in the Python file temporal.py after isort has been run:

from typing import Sequence

import pandas

from examplePy.temperature import fahrenheit_to_celsius
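The grouping that isort applied above can be approximated in a few lines of Python. This is only a sketch of the idea, not isort's actual algorithm: it assumes one import per line, treats `examplePy` as the (hypothetical) first-party package, and relies on `sys.stdlib_module_names`, which exists on Python 3.10 and later (a tiny fallback set is used otherwise).

```python
import sys

# sys.stdlib_module_names exists on Python 3.10+; the fallback set is only
# large enough for this example.
STDLIB = getattr(sys, "stdlib_module_names", frozenset({"typing", "os", "sys"}))
FIRST_PARTY = {"examplePy"}  # assumption: the name of your own package


def section(import_line: str) -> int:
    """Return 0 for standard library, 1 for third party, 2 for first party."""
    # The second word of "import x" or "from x.y import z" is the module path.
    top_level = import_line.split()[1].split(".")[0]
    if top_level in STDLIB:
        return 0
    if top_level in FIRST_PARTY:
        return 2
    return 1


def sort_imports(lines: list[str]) -> str:
    """Group import lines by section and separate the groups with blank lines."""
    groups: dict[int, list[str]] = {0: [], 1: [], 2: []}
    for line in lines:
        groups[section(line)].append(line)
    # Alphabetize within each group, skipping groups that are empty.
    blocks = ["\n".join(sorted(group)) for group in groups.values() if group]
    return "\n\n".join(blocks)


unsorted = [
    "from examplePy.temperature import fahrenheit_to_celsius",
    "import pandas",
    "from typing import Sequence",
]
print(sort_imports(unsorted))
```

Anything that is neither in the standard library nor listed as first party falls into the third party group, which is why telling the tool about your own package name matters.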

Ruff#

Ruff is a new addition to the code quality ecosystem, gaining some traction since its release. ruff is both a linter and a code formatter for Python, aiming to replace several tools behind a single interface. As such, ruff can be used as a replacement for all of the other tools mentioned here, or as a complement to some of them.

ruff has some interesting features that distinguish it from other linters:

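Here is a simple configuration to get started with ruff. It would go into your pyproject.toml. Note that recent ruff releases prefer select under [tool.ruff.lint] and treat the top-level form shown here as deprecated: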

[tool.ruff]
select = [
    "E", # pycodestyle errors
    "W", # pycodestyle warnings
    "F", # pyflakes. "E" + "W" + "F" + "C90" (mccabe complexity) is equivalent to flake8
    "I", # isort
]

Depending on your project, you might want to add the following to sort imports correctly:

[tool.ruff.isort]
known-first-party = ["examplePy"]

How to use a code formatter in your local workflow#

Linters, code formatters and your favorite coding tools#

Linters can be run as command-line tools, as shown above. They can also be run within your favorite coding tool (e.g. VS Code, PyCharm, etc). For example, you might prefer to have tools like Black and isort run when you save a file. In some editors you can also set up shortcuts that run your favorite code format tools on demand.

Use pre-commit hooks to run code formatters and linters on commits#

You can also set up a pre-commit hook in your Python package repository.

A pre-commit hook is a tool that allows an action (or actions) to be triggered when you apply a commit to your git repository.

Pre-commit hook example workflow#

The pre-commit workflow looks like this. At the command line, you type and run:

git commit -m "message here"

Diagram showing the steps of a pre-commit workflow from left to right.

The pre-commit workflow begins with you adding files that have changes to be staged in git. Next, you run git commit. When you run git commit, the pre-commit hooks run. In this example, Black, the code formatter, and flake8, a linter, both run. If all of the files pass the Black and flake8 checks, your commit is recorded. If they don’t, the commit is canceled. You will have to fix any flake8 issues, and then re-add / stage the files to be committed.
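The control flow in the diagram can be sketched as a small simulation. Nothing here calls git or pre-commit: `formatter_hook`, `linter_hook` and `try_commit` are hypothetical stand-ins, and the "formatter" only converts tabs to spaces. The point is the logic: a hook that rewrites a file counts as a failure, a linter only reports, and the commit goes through only when every hook passes.

```python
from typing import Callable

# Each hook takes file contents and returns (passed, possibly_rewritten_contents).
Hook = Callable[[str], tuple[bool, str]]


def formatter_hook(text: str) -> tuple[bool, str]:
    """Stand-in for a formatter: rewrites the file and fails if it changed anything."""
    fixed = text.replace("\t", "    ")
    return fixed == text, fixed


def linter_hook(text: str) -> tuple[bool, str]:
    """Stand-in for a linter: reports a problem but never edits the file."""
    too_long = any(len(line) > 79 for line in text.splitlines())
    return not too_long, text


def try_commit(staged: dict[str, str], hooks: list[Hook]) -> bool:
    """Run every hook on every staged file; succeed only if all hooks pass."""
    all_passed = True
    for path, text in list(staged.items()):
        for hook in hooks:
            passed, text = hook(text)
            all_passed = all_passed and passed
        staged[path] = text  # keep any automatic fixes so they can be re-staged
    return all_passed


staged = {"model.py": "def f():\n\treturn 1\n"}
hooks = [formatter_hook, linter_hook]

first_attempt = try_commit(staged, hooks)   # formatter rewrites the file
second_attempt = try_commit(staged, hooks)  # nothing left to fix
print(first_attempt, second_attempt)
```

The first attempt is canceled because the formatter changed the file; after the fixed file is staged again, the second attempt passes.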

Important

If you have a Python code-base with multiple maintainers actively working on the code, and you intend to run a tool like Black, be sure to coordinate across your team. An initial commit that applies Black to your entire package will likely change a significant amount of your code. This could lead to merge conflicts on open and new PRs before the new changes are merged.

General pre commit checks#

In addition to calling tools, pre-commit also has a suite of built-in format hooks that you can call. Some, such as trailing-whitespace, can also be useful to add to your pre-commit workflow to ensure clean, streamlined code files.
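To give a sense of what a hook like trailing-whitespace does to a file, here is a minimal Python sketch of the same kind of cleanup. It is an illustration only, not the hook's actual implementation, and `strip_trailing_whitespace` is a made-up name.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove spaces and tabs at the end of every line, keeping line endings."""
    cleaned = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")   # the line without its newline characters
        ending = line[len(body):]    # whatever newline characters it had
        cleaned.append(body.rstrip(" \t") + ending)
    return "".join(cleaned)


before = "a = 1  \nb = 2\t\nc = 3\n"
after = strip_trailing_whitespace(before)
print(repr(after))
```

Changes like this are invisible in most editors, which is why automating them keeps diffs free of noise.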

An example .pre-commit-config.yaml file is below, showing how this is all set up.

Pre-commit.ci#

Pre-commit.ci is a bot that may become your new best friend. This bot, when set up on a repo, can be configured to do the following:

The pre-commit.ci bot uses the same .pre-commit-config.yaml file that you use to set up pre-commit locally.

Setting up a bot like this can be valuable because:

Setting up a git pre-commit hook#

To set up pre-commit locally, you need to do 3 things:

  1. Install pre-commit (and include it as a development requirement in your repository)

python -m pip install pre-commit

or

conda install -c conda-forge pre-commit

  2. Create a .pre-commit-config.yaml file in the root of your package directory.

Below is an example .pre-commit-config.yaml file that can be used to set up the pre-commit hook and the pre-commit.ci bot if you choose to implement that too.

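repos:
  # Misc commit checks using built in pre-commit checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: ...  # pin to a released tag
    hooks:
      - id: trailing-whitespace

  # Linting: Python code (see the file .flake8)
  - repo: https://github.com/PyCQA/flake8
    rev: 6.0.0
    hooks:
      - id: flake8

  # Black for auto code formatting
  - repo: https://github.com/psf/black
    rev: ...  # pin to a released tag
    hooks:
      - id: black

# Tell the pre-commit.ci bot to update the versions of the code format tools
# listed in this file every quarter.
# The default is to update weekly, which is too many new PRs for many
# maintainers (remove these lines if you aren't using the bot!)
ci:
  autoupdate_schedule: quarterly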

This file specifies hooks that will be triggered automatically before each git commit. In this case, it includes flake8 at version 6.0.0.

  3. Install your pre-commit hook(s) using pre-commit install. This will install all of the hooks specified in the pre-commit yaml file into your environment.

Once you have done the above, you are ready to start working on your code. Pre-commit will run every time you run git commit.

Summary#

pyOpenSci suggests setting up a linter and a code styler for your package, regardless of whether you use pre-commit hooks, CI or other infrastructure to manage code format. Setting up these tools will give you automatic feedback about your code’s structure as you (or a contributor) write it. And using a tool like Black that formats code for you reduces the effort you need to spend on decisions about code format and style.