Consider the following program:
import pandas as pd
import datetime
import time
import psutil
import os
import gc

# Construct a trivial pandas time series
data = []
indexes = []
for i in range(5):
    data.append(i)
    indexes.append(datetime.datetime.now())
    time.sleep(1)

s = pd.Series(data, index=indexes)

for _ in range(100000):
    # Remove the next line to prevent memory leak
    foo = datetime.datetime.now() - s.index[-1]

    # These lines are okay
    foo_dt = datetime.datetime.now()
    foo_idx = s.index[-1]

    #gc.collect()  # This mitigates but does not eliminate the problem

    # Get memory per https://stackoverflow.com/a/21632554/939259
    process = psutil.Process(os.getpid())
    print(process.memory_info().rss)
This produces the following output (with the gc.collect() call enabled):
$ python ./test_leak.py | uniq
60502016
60547072
60755968
<snip>
Without gc.collect(), the output is similar:
$ python ./test_leak.py | uniq
60518400
60588032
60776448
<snip>
What is going on here? Why does memory usage grow when all I am doing is assigning a temporary value?
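One possible way to narrow this down is to measure how much memory a single statement retains when it is called repeatedly. The following is only a rough sketch, assuming Python 3.4+ for the standard-library tracemalloc module. The helper traced_growth and the demo functions leaky and clean are hypothetical names for illustration, not part of the program above. tracemalloc only sees allocations made through Python's own allocator, so growth inside native extension code that calls malloc directly would not show up here, and RSS as reported by psutil would remain the better indicator in that case.

```python
import gc
import tracemalloc


def traced_growth(func, iterations=1000):
    """Return the net change in traced bytes after calling func repeatedly.

    Hypothetical helper: a value near zero suggests func does not retain
    Python-level memory between calls; a value that scales with
    `iterations` suggests that it does.
    """
    gc.collect()
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            func()
        gc.collect()  # drop anything that is only kept alive by cycles
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


# Illustrative stand-ins, not taken from the program above.
_retained = []


def leaky():
    # Keeps a reference to every buffer, so memory grows on each call.
    _retained.append(bytearray(1024))


def clean():
    # The temporary is released as soon as the function returns.
    tmp = bytearray(1024)


if __name__ == "__main__":
    print("leaky:", traced_growth(leaky))
    print("clean:", traced_growth(clean))
```

To apply this to the program above, one could pass `lambda: datetime.datetime.now() - s.index[-1]` as func and compare the result against `lambda: s.index[-1]`.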