5.10 Profiling

Section Author: Martin Albrecht (malb@informatik.uni-bremen.de)
``Premature optimization is the root of all evil.'' - Donald Knuth

Sometimes it is useful to check for bottlenecks in code to understand which parts take the most computational time; this can give a good idea of which parts to optimize. Python and therefore Sage offers several profiling--as this process is called--options.

The simplest to use is the prun command in the interactive shell. It returns a summary describing which functions took how much computational time. To profile (the currently slow! - as of version 1.0) matrix multiplication over finite fields, for example, do:

sage: k,a = GF(2**8, 'a').objgen()
sage: A = Matrix(k,10,10,[k.random_element() for _ in range(10*10)])

sage: prun B = A*A
       32893 function calls in 1.100 CPU seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)
 12127  0.160   0.000   0.160  0.000 :0(isinstance)
  2000  0.150   0.000   0.280  0.000 matrix.py:2235(__getitem__)
  1000  0.120   0.000   0.370  0.000 finite_field_element.py:392(__mul__)
  1903  0.120   0.000   0.200  0.000 finite_field_element.py:47(__init__)
  1900  0.090   0.000   0.220  0.000 finite_field_element.py:376(__compat)
   900  0.080   0.000   0.260  0.000 finite_field_element.py:380(__add__)
     1  0.070   0.070   1.100  1.100 matrix.py:864(__mul__)
  2105  0.070   0.000   0.070  0.000 matrix.py:282(ncols)
  ...

Here ncalls is the number of calls, tottime is the total time spent in the given function (and excluding time made in calls to sub-functions), percall is the quotient of tottime divided by ncalls. cumtime is the total time spent in this and all sub-functions (i.e., from invocation until exit), percall is the quotient of cumtime divided by primitive calls, and filename:lineno(function) provides the respective data of each function. The rule of thumb here is: The higher the function in that listing the more expensive it is. Thus it is more interesting for optimization.

As usual, prun? provides details on how to use the profiler and understand the output.

The profiling data may be written to an object as well to allow closer examination:

sage: prun -r A*A
sage: stats = _
sage: ?stats

Note: entering stats = prun -r A*A displays a syntax error message because prun is an IPython shell command, not a regular function.

For a nice graphical representation of profiling data you can use the hotshot profiler, a small script called hotshot2cachetree and the program kcachegrind (Unix only). The same example with the hotshot profiler:

sage: k,a = GF(2**8, 'a').objgen()
sage: A = Matrix(k,10,10,[k.random_element() for _ in range(10*10)])
sage: import hotshot
sage: filename = "pythongrind.prof"
sage: prof = hotshot.Profile(filename, lineevents=1)
sage: prof.run("A*A")
<hotshot.Profile instance at 0x414c11ec>
sage: prof.close()

This results in a file pythongrind.prof in the current working directory. It can now be converted to the cachegrind format for visualization.

On a system shell type

hotshot2calltree -o cachegrind.out.42 pythongrind.prof

The output file cachegrind.out.42 can now be examined with kcachegrind. Please note that the naming convention cachegrind.out.XX needs to be obeyed.

See About this document... for information on suggesting changes.