Python - numpy.ndarray objects not garbage collected


While trying to fine-tune some memory leaks in the Python bindings for some C/C++ functions, I came across some strange behavior pertaining to the garbage collection of NumPy arrays.

I have created a couple of simplified cases in order to better explain the behavior. The code was run using memory_profiler, and its output follows. It appears that Python's garbage collection is not working as expected when it comes to NumPy arrays:

    # File deallocate_ndarray.py
    @profile
    def ndarray_deletion():
        import numpy as np
        from gc import collect
        buf = 'abcdefghijklmnopqrstuvwxyz' * 10000
        arr = np.frombuffer(buf)
        del arr
        del buf
        collect()
        y = [i**2 for i in xrange(10000)]
        del y
        collect()

    if __name__=='__main__':
        ndarray_deletion()

I invoked memory_profiler with the following command:

python -m memory_profiler deallocate_ndarray.py

This is what I got:

    Filename: deallocate_ndarray.py

    Line #    Mem usage    Increment   Line Contents
    ================================================
         5   10.379 MiB    0.000 MiB   @profile
         6                             def ndarray_deletion():
         7   17.746 MiB    7.367 MiB       import numpy as np
         8   17.746 MiB    0.000 MiB       from gc import collect
         9   17.996 MiB    0.250 MiB       buf = 'abcdefghijklmnopqrstuvwxyz' * 10000
        10   18.004 MiB    0.008 MiB       arr = np.frombuffer(buf)
        11   18.004 MiB    0.000 MiB       del arr
        12   18.004 MiB    0.000 MiB       del buf
        13   18.004 MiB    0.000 MiB       collect()
        14   18.359 MiB    0.355 MiB       y = [i**2 for i in xrange(10000)]
        15   18.359 MiB    0.000 MiB       del y
        16   18.359 MiB    0.000 MiB       collect()

I don't understand why the forced calls to collect() don't reduce the memory usage of the program by freeing some memory. Moreover, even if NumPy arrays don't behave normally because of the underlying C constructs, why doesn't the list (which is pure Python) get garbage collected?

I know that del does not directly call the underlying __del__ method, but note that all the del statements in the code end up reducing the reference count of the corresponding objects to 0 (thereby making them eligible for garbage collection, AFAIK). Typically, I would expect to see a negative entry in the Increment column when an object undergoes garbage collection. Can anyone shed some light on what is going on here?
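The reference-counting claim above can be checked independently of any memory measurement. The following is a minimal sketch, assuming CPython and written in Python 3 syntax rather than the Python 2.7 used in the test; `Payload` and `freed_by_del_alone` are hypothetical names introduced only for illustration. It uses a weak reference to observe whether the object still exists after `del`, before `gc.collect()` is ever called.

```python
import gc
import weakref


class Payload(object):
    """Hypothetical stand-in for any reference-counted object."""

    def __init__(self, size):
        self.data = bytearray(size)  # owns a buffer of `size` bytes


def freed_by_del_alone(size=260000):
    """Return (alive_before_del, dead_after_del) for a Payload instance."""
    obj = Payload(size)
    probe = weakref.ref(obj)          # does not keep the object alive
    alive_before = probe() is not None
    del obj                           # drops the last strong reference
    dead_after_del = probe() is None  # checked before any gc.collect()
    gc.collect()                      # only needed for reference cycles
    return alive_before, dead_after_del
```

If this returns `(True, True)` on the interpreter in question, the object is already destroyed by `del`, and a flat Increment column says something about how memory is reported rather than about whether the object was collected.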

Note: This test was run on OS X 10.10.4, Python 2.7.10 (conda), NumPy 1.9.2 (conda), memory_profiler 0.33 (conda-binstar), psutil 2.2.1 (conda).

In order to see the memory being garbage collected, I had to increase the size of buf by several orders of magnitude. Maybe the size was too small for memory_profiler to detect the change (it queries the OS, so the measurements are not very precise), or maybe it was too small for the Python garbage collector to care; I don't know.
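One way to separate the two possibilities is to measure at the interpreter's allocator level instead of asking the OS. The sketch below is an assumption-laden illustration, not what memory_profiler does: it uses the standard-library `tracemalloc` module, which requires Python 3.4 or later (the test above ran on Python 2.7), and `traced_alloc_and_free` is a hypothetical helper name.

```python
import tracemalloc


def traced_alloc_and_free(factor=10000):
    """Return (bytes gained by allocating buf, bytes left after del buf)."""
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        # Same shape as the original buf, built as bytes for Python 3.
        buf = b'abcdefghijklmnopqrstuvwxyz' * factor
        after_alloc, _ = tracemalloc.get_traced_memory()
        del buf
        after_del, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after_alloc - baseline, after_del - baseline
```

If the first value is roughly the buffer size and the second is close to zero, the interpreter released the block even though an OS-level reading may not move.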

For example, replacing 10000 with 100000000 in the factor of buf yields:

    Line #    Mem usage    Increment   Line Contents
    ================================================
        21   10.289 MiB    0.000 MiB   @profile
        22                             def ndarray_deletion():
        23   17.309 MiB    7.020 MiB       import numpy as np
        24   17.309 MiB    0.000 MiB       from gc import collect
        25 2496.863 MiB 2479.555 MiB       buf = 'abcdefghijklmnopqrstuvwxyz' * 100000000
        26 2496.867 MiB    0.004 MiB       arr = np.frombuffer(buf)
        27 2496.867 MiB    0.000 MiB       del arr
        28   17.312 MiB -2479.555 MiB       del buf
        29   17.312 MiB    0.000 MiB       collect()
        30   17.719 MiB    0.406 MiB       y = [i**2 for i in xrange(10000)]
        31   17.719 MiB    0.000 MiB       del y
        32   17.719 MiB    0.000 MiB       collect()
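As a sanity check on the two buf sizes, the expected payload can be worked out by hand: 26 characters times the factor, converted to MiB. The snippet below is a rough sketch in Python 3 syntax; `expected_buf_mib` is a hypothetical helper, and it ignores the small per-object header of the string.

```python
MIB = 1024 * 1024  # bytes per mebibyte, the unit memory_profiler reports


def expected_buf_mib(factor, unit=b'abcdefghijklmnopqrstuvwxyz'):
    """Approximate payload size in MiB of `unit * factor`."""
    return len(unit) * factor / MIB


small = expected_buf_mib(10000)      # about 0.248 MiB
large = expected_buf_mib(100000000)  # about 2479.553 MiB
```

These figures line up with the 0.250 MiB and 2479.555 MiB increments reported on the buf line in the two profiler runs, so in both runs the buffer is the allocation being measured; only the large one shows up again as a negative increment on del buf.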
