Python static code analysis and linting

Over the past years, I have been using various tools that helped me to write better Python code and catch common errors before committing the code. Linter is a piece of software that can help with that and there are a few Python linters that are capable of finding and reporting issues with your code. I would like to split the types of issues a linter can report into three groups:

Obvious code errors that would cause runtime errors.

Those are easy ones. To mention a few:

  • you forgot to declare a variable before it is used;
  • you supplied wrong number of arguments to a function;
  • you try to access a non-existing class property or method.

Linters help you catch those errors so it is great to run the linter on your Python modules before executing them. You would need to modify your code manually. PyLint or flake8 could be used.

Style related issues that do not cause runtime errors.

Those are easy ones, too. To mention a few:

  • a code line is too large making it difficult to read;
  • a docstring has single quotes (instead of double quotes);
  • you have two statements on the same line of code (separated with semicolon ;);
  • you have too many spaces around certain operators such as assignment.

Linters can also help you catch those issues so it is great to run the linter on your Python modules before executing them. They are less critical as you won’t get any runtime errors due to those issues found. Fixing those issues, however, will make your code more consistent and easier to work with.

Making such changes as separating function arguments with a space or breaking a longer line manually could be tedious. It becomes even more difficult if you are working with a legacy Python module that was written without any style guidelines. You could need to reformat every single line of the code which is very impractical.

Fortunately, there are a couple of Python code formatters (such as autopep8 and yapf) that could reformat Python module in place meaning you don’t have do it manually. Formatters depend on the configuration that would define how certain issues should be handled, for instance, the maximum length line or whether all function arguments should be supplied each on a separate line. The configuration files is read every time formatter runs and makes it possible to use the same code style which is of utter importance when you are a team of Python developers.

The general style guidelines can be found at PEP-8 which is the de-facto standard for code formatting used by Python community. However, if PEP-8 suggestions don’t work for your team, you can tweak it; the important thing is that everyone agrees and sticks to the standard you decide to use.

Code quality, maintainability, and compliance to best practices

Those are more complex mainly because it is way harder to identify less obvious issues in the code. To mention a few:

  • a class has too many methods and properties;
  • there are two many nesting if and else statements;
  • a function is too complex and should be split into multiple ones.

There are just a few linters that can give some hints on those. You would obviously need to modify your code manually.

As I used linters more and more, I’ve been exposed to various issues I have not thought of earlier. At that point of time I realized that linters I used could not catch all kinds of issues that could exist leaving some of them for me for debugging. So, I’ve started searching for other Python linters and Python rulesets that I could learn from to write better code.

There are so many Python linters and programs that could help you with static analysis – a dedicated list is maintained under awesome-static-analysis repository.

I would like to share a couple of helpful tools not listed there that could be of great help. The obvious errors and stylistic issues are less interesting because there are so many Python linters that would report those such as pylint and flake8. However, for more complex programs, to be able to get some insights on possible refactoring can often be more important. Knowing the best practices of the language and idioms can also make your Python code easier to understand and maintain. There are even some companies that develop products that check the quality of your code and report any possible issues. Reading through their rulesets is also very helpful.

wemake-python-styleguide is a flake8 plugin that aggregates many other flake8 plugins reporting a huge number of issues of all three categories we have discussed above. Custom rules (not reported by any flake8 plugin) can be found at the docs page.

SonarSource linter is available in multiple IDEs and can report all kinds of bugs, code smells, and pieces of code that are too complex and should be refactored. Make sure to read through the ruleset, it is a great one.

Semmle ruleset is an open-source product (as part of LGTM.com), and their ruleset is very helpful and should be reviewed.

SourceMeter ruleset is not an open-source product (however, there is a free version) but their ruleset is also very helpful and should be reviewed.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s