If you have ever needed to run a rather large data processing script, you know that it can be difficult to track the progress of its execution. If you copy a number of files from one directory into another, you can easily show the progress by working out the total size of all files and then printing how much has already been copied or how much is left. However, if your program does many things and executes code from third-party packages, there is a risk you won’t have a clue about how much time is left, or even where you are in the program, that is, which line of code is currently being executed.
A simple solution to this is to spread the
Fortunately, there is a built-in module
trace available both in Python 2 and 3 which you can use to show the progress of your program execution. To learn more about the
trace module, take a look at its standard library docs page and at the Python MOTW page.
To provide a simple example, if you have a Python script
run.py containing the following:
import math
import time

print(math.pow(2, 3))
print(math.pow(2, 4))
print(math.pow(2, 5))
then you can run
python -m trace -t --ignore-dir C:\Python36 .\run.py
to see live updates on which line of your program is being executed. This means you can start a long-running script in a terminal and come back to it now and then to check its progress, because it prints each line as it is executed. The handy
--ignore-dir option lets you filter out calls to the internal Python modules so your terminal won’t be polluted with unnecessary details.
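The same tracing can also be switched on from inside a script rather than from the command line. Here is a minimal sketch using trace.Trace from the standard library. The process function is a made-up stand-in for your real workload, and passing sys.prefix and sys.exec_prefix as ignoredirs is just one reasonable way to mimic --ignore-dir.

```python
import sys
import trace


def process(values):
    """Hypothetical stand-in for a long-running job."""
    total = 0
    for value in values:
        total += value ** 2
    return total


# trace=True prints each line as it runs (like -t on the command line);
# count=False skips the per-line counters we do not need here;
# ignoredirs plays the role of --ignore-dir.
tracer = trace.Trace(
    trace=True,
    count=False,
    ignoredirs=[sys.prefix, sys.exec_prefix],
)

# runfunc() calls process([1, 2, 3]) under the tracer and returns its result.
result = tracer.runfunc(process, [1, 2, 3])
print(result)
```

This can be handy when you only want to trace one function call instead of the whole script.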
On Windows, be aware of a bug in CPython that breaks --ignore-dir: the directory comparison works incorrectly on case-insensitive file systems (such as NTFS on Windows). So be sure to specify the path to the Python interpreter directory using the right case (
C:\Python36 would work, but
c:\python36 would not).
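To get a feel for why the case matters, here is a small illustration of how a plain string prefix check can go wrong. It is only a sketch of the failure mode, not CPython's actual code. The helper names are made up, and ntpath is used so the example behaves the same on any platform.

```python
import ntpath  # Windows path rules, importable on any platform

# Hypothetical path of a standard library module on Windows.
module_file = r"C:\Python36\lib\os.py"


def is_ignored_naive(filename, ignore_dir):
    """Case-sensitive prefix check (illustrative helper)."""
    return filename.startswith(ignore_dir + "\\")


def is_ignored_normalized(filename, ignore_dir):
    """Same check after normalizing the case of both paths."""
    prefix = ntpath.normcase(ignore_dir) + "\\"
    return ntpath.normcase(filename).startswith(prefix)


print(is_ignored_naive(module_file, r"C:\Python36"))       # True
print(is_ignored_naive(module_file, r"c:\python36"))       # False
print(is_ignored_normalized(module_file, r"c:\python36"))  # True
```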
You can also provide multiple directories to ignore, but on Windows the syntax depends on which shell you run your Python script from.
- Git Bash:
$ python -m trace -t --ignore-dir 'C:/Python36;C:/Util' run.py
- cmd:
python -m trace -t --ignore-dir C:\Python36;C:\Util .\run.py
- PowerShell:
python -m trace -t --ignore-dir 'C:\Python36;C:\Util' .\run.py
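If the quoting rules of the different shells get in the way, one option is to build the command in Python and skip the shell altogether. The sketch below assumes that trace splits the --ignore-dir value on os.pathsep (";" on Windows, ":" on Linux), which is what its help text describes. build_trace_command is a hypothetical helper name.

```python
import os
import sys


def build_trace_command(script, ignore_dirs):
    """Return an argument list for running `script` under `python -m trace -t`."""
    return [
        sys.executable,  # the interpreter that is running this code
        "-m", "trace", "-t",
        "--ignore-dir", os.pathsep.join(ignore_dirs),
        script,
    ]


command = build_trace_command("run.py", ["C:/Python36", "C:/Util"])
print(command)

# The list can be passed as-is to subprocess.run(command); no shell is
# involved, so the separator never needs quoting.
```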
On Linux, it seems you don’t have to provide the
--ignore-dir argument at all to filter out the system calls:
linuxuser@LinuxMachine:~/Development$ python -m trace -t run.py
 --- modulename: run, funcname: <module>
run.py(1): import math
run.py(2): import time
run.py(4): print(math.pow(2, 3))
run.py(5): print(math.pow(2, 4))
run.py(6): print(math.pow(2, 5))
 --- modulename: trace, funcname: _unsettrace
The trace module also has other uses, such as reporting which lines were executed and how many times, which can be useful if you would like to see which branch of your
if-else was taken during program execution. However, you wouldn’t use the
trace module just to measure code coverage, because there is a Python package called
coverage.py that provides much richer functionality for this, so be sure to use that instead.
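For a quick look at which branch ran without installing anything, the counting mode of trace can be enough. The following is a rough sketch: classify is a made-up function, and reading the counts attribute of the results object is an assumption about an attribute that is present in CPython's trace.py but not prominently documented.

```python
import trace


def classify(n):
    if n % 2 == 0:
        label = "even"
    else:
        label = "odd"
    return label


# count=True records how often each line runs; trace=False keeps it quiet.
tracer = trace.Trace(count=True, trace=False)
label = tracer.runfunc(classify, 3)

# counts maps (filename, line number) to the number of executions.
counts = tracer.results().counts

code = classify.__code__
executed_offsets = sorted(
    lineno - code.co_firstlineno
    for (filename, lineno) in counts
    if filename == code.co_filename
)

# Offsets are relative to the "def" line, so offset 2 is the "even"
# assignment and offset 4 is the "odd" one.
print(label, executed_offsets)
```

Note that trace only follows code that lives in a real file, so run this as a script rather than pasting it into the interactive interpreter.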