Python garbage collection can be that slow?


Keywords:python 


Question: 

I have a problem with my python application, and I think it's related to the python garbage collection, even if I'm not sure...

The problem is that my application takes a lot of time to exit and to switch to one function to the next one.

In my application I handle very large dictionaries, containing thousands of large objects which are instantiated from wrapped C++ classes.

I put some timestamp outputs in my program, and I saw that at the end of each function, when objects created inside the function should go out of scope, a lot of time is spent by the interpreter before calling the next function. And I observe the same problem at the end of the application, when the program should exit: a lot of time (~ hours!) is spent between the last timestamp on screen and the appearance of the fresh prompt.

The memory usage is stable, so I don't really have memory leaks.

Any suggestions?

Can be the garbage collection of thousands of large C++ objects that slow?

Is there a method to speed up that?

UPDATE:

Thanks a lot for all your answers, you gave me a lot of hints to debug my code :-)

I use Python 2.6.5 on Scientific Linux 5, a customized distribution based on Red Hat Enterprise 5. And actually I'm not using SWIG to get Python bindings for our C++ code, but the Reflex/PyROOT framework. I know, it's not very known outside particle physics (but still open source and freely available) and I have to use it because it's the default for our main framework.

And in this context the DEL command from the Python side does not work, I had already tried it. DEL only deletes the python variable linked to the C++ object, not the object itself in memory, which is still owned by the C++ side...

...I know, it's not-standard I guess, and a bit complicated, sorry :-P

But following your hints, I'll profile my code and I'll come back to you with more details, as you suggested.

ADDITIONAL UPDATE:

Ok, following your suggestions, I instrumented my code with cProfile, and I discovered that actually the gc.collect() function is the function taking the most of the running time!!

Here the output from cProfile + pstats print_stats():


    >>> p.sort_stats("time").print_stats(20)
Wed Oct 20 17:46:02 2010    mainProgram.profile

         547303 function calls (542629 primitive calls) in 548.060 CPU seconds

   Ordered by: internal time
   List reduced from 727 to 20 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        4  345.701   86.425  345.704   86.426 {gc.collect}
        1  167.115  167.115  200.946  200.946 PlotD3PD_v3.2.py:2041(PlotSamplesBranches)
       28   12.817    0.458   13.345    0.477 PlotROOTUtils.py:205(SaveItems)
     9900   10.425    0.001   10.426    0.001 PlotD3PD_v3.2.py:1973(HistoStyle)
     6622    5.188    0.001    5.278    0.001 PlotROOTUtils.py:403(__init__)
       57    0.625    0.011    0.625    0.011 {built-in method load}
      103    0.625    0.006    0.792    0.008 dbutils.py:41(DeadlockWrap)
       14    0.475    0.034    0.475    0.034 {method 'dump' of 'cPickle.Pickler' objects}
     6622    0.453    0.000    5.908    0.001 PlotROOTUtils.py:421(CreateCanvas)
    26455    0.434    0.000    0.508    0.000 /opt/root/lib/ROOT.py:215(__getattr__)
[...]

>>> p.sort_stats("cumulative").print_stats(20)
Wed Oct 20 17:46:02 2010    mainProgram.profile

         547303 function calls (542629 primitive calls) in 548.060 CPU seconds

   Ordered by: cumulative time
   List reduced from 727 to 20 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001  548.068  548.068 PlotD3PD_v3.2.py:2492(main)
        4    0.000    0.000  346.756   86.689 /usr/lib//lib/python2.5/site-packages/guppy/heapy/Use.py:171(heap)
        4    0.005    0.001  346.752   86.688 /usr/lib//lib/python2.5/site-packages/guppy/heapy/View.py:344(heap)
        1    0.002    0.002  346.147  346.147 PlotD3PD_v3.2.py:2537(LogAndFinalize)
        4  345.701   86.425  345.704   86.426 {gc.collect}
        1  167.115  167.115  200.946  200.946 PlotD3PD_v3.2.py:2041(PlotBranches)
       28   12.817    0.458   13.345    0.477 PlotROOTUtils.py:205(SaveItems)
     9900   10.425    0.001   10.426    0.001 PlotD3PD_v3.2.py:1973(HistoStyle)
    13202    0.336    0.000    6.818    0.001 PlotROOTUtils.py:431(PlottingCanvases)
     6622    0.453    0.000    5.908    0.001 /root/svn_co/rbianchi/SoftwareDevelopment

[...]

>>>

So, in both outputs, sorted by "time" and by "cumulative" time respectively, gc.collect() is the function consuming the most of the running time of my program! :-P

And this is the output of the memory profiler Heapy, just before returning the main() program.

memory usage before return:
Partition of a set of 65901 objects. Total size = 4765572 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25437  39  1452444  30   1452444  30 str
     1   6622  10   900592  19   2353036  49 dict of PlotROOTUtils.Canvas
     2    109   0   567016  12   2920052  61 dict of module
     3   7312  11   280644   6   3200696  67 tuple
     4   6622  10   238392   5   3439088  72 0xa4ab74c
     5   6622  10   185416   4   3624504  76 PlotROOTUtils.Canvas
     6   2024   3   137632   3   3762136  79 types.CodeType
     7    263   0   129080   3   3891216  82 dict (no owner)
     8    254   0   119024   2   4010240  84 dict of type
     9    254   0   109728   2   4119968  86 type
  Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    10   1917   3   107352   2   4264012  88 function
    11   3647   5   102116   2   4366128  90 ROOT.MethodProxy
    12    148   0    80800   2   4446928  92 dict of class
    13   1109   2    39924   1   4486852  93 __builtin__.wrapper_descriptor
    14    239   0    23136   0   4509988  93 list
    15     87   0    22968   0   4532956  94 dict of guppy.etc.Glue.Interface
    16    644   1    20608   0   4553564  94 types.BuiltinFunctionType
    17    495   1    19800   0   4573364  94 __builtin__.weakref
    18     23   0    11960   0   4585324  95 dict of guppy.etc.Glue.Share
    19    367   1    11744   0   4597068  95 __builtin__.method_descriptor

Any idea why, or how to optimize the garbage collection?

Is there any more detailed check I can do?


3 Answers: 

This is known garbage collector issue in Python 2.6 causing quadratic time for garbage collection when many objects are being allocated without deallocating any of them ie. population of large list.
There are two simple solutions:

  1. either disable garbage collection before populating large lists and enable it afterwards

    l = []
    gc.disable()
    for x in xrange(10**6):
      l.append(x)
    gc.enable()
    
  2. or update to Python 2.7, where the issue has been solved

I prefer the second solution, but it's not always an option;)

 

Yes, it could be garbage collection, but it could also be some synchronisation with the C++ code, or something completely different (hard to say without code).

Anyway, you should have a look at SIG for development of Python/C++ integration to find issues and how to speed up things.

 

If your problem really is the garbage collection, try explicitly freeing your objects when you're done with them using del().

In general, this doesn't sound like a garbage collection problem, unless we're talking about terabytes of memory.

I agree with S.Lott... profile your app, then bring code snippets and the results of that back and we can be much more helpful.