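I have a function that runs in a dedicated multiprocessing process. It can run in an endless loop, re-schedule itself with a threading timer, or run just once. In the first and second cases memory usage grows without bound, even though in endless-loop mode I am only rebinding the same variables to new objects. I use a queue to communicate with the main process, and the main process pops everything from it.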
Here is the function's code:
def schedule(args, fn, q=None):
    isendless = True
    while isendless:
        # retrieve from database
        pd_original_matrice = get_rating(args)
        # save userID / index correspondence
        index = pd_original_matrice['CustomerId'].values
        index = dict(zip(index, range(len(index))))
        now = dt.now().replace(microsecond=0)  # start timer
        mf = MF(pd_original_matrice.iloc[:, 1:].fillna(0).values.astype(np.float64),
                args.nbFeatures, args.learning_rate, args.regularization, args.nbEpoch)
        training_rmse, testing_rmse = mf.train(args.prediction_method, args.factorization_method,
                                               split=args.split, reduc=args.reduc)
        then = dt.now().replace(microsecond=0)  # get elapsed time
        log.normal('__MF has ended, duration %s , min test RMSE of %0.6f achieved at epoch %d'
                   % (str(then - now), min(testing_rmse), testing_rmse.index(min(testing_rmse)) + 1))
        # schedule a new MF
        if args.cycle != 0:
            log.normal('___New MF schedule in %d seconds' % args.cycle)
            t = timer(args.cycle, schedule, [args, fn, q])  # re-schedule
            t.start()
        print(Fore.WHITE)
        if q is not None:  # for endless or re-scheduled
            q.put((mf, pd_original_matrice, index))
        else:  # for one-time mf
            return mf, pd_original_matrice, index
        if args.endless_mf != 1:  # final do-while test
            isendless = False
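
To make "popping everything" on the main-process side concrete, here is a minimal sketch of what the consumer could look like. It is not my actual code: `drain_latest` is a hypothetical helper, and the demo uses a plain `queue.Queue` so it runs without starting a process. A `multiprocessing.Queue` raises the same `queue.Empty` exception, although items put by another process may take a moment to become visible.

```python
import queue


def drain_latest(q):
    """Pop every pending item from q and return the most recent one.

    Returns None when the queue is empty. Older items are dropped, so
    the caller never accumulates stale (mf, matrix, index) tuples.
    """
    latest = None
    while True:
        try:
            latest = q.get_nowait()
        except queue.Empty:
            return latest


if __name__ == "__main__":
    q = queue.Queue()
    for i in range(3):
        # Hypothetical payloads standing in for (mf, pd_original_matrice, index)
        q.put(("mf_%d" % i, "matrix_%d" % i, {"user": i}))
    print(drain_latest(q))  # the last tuple that was put
    print(drain_latest(q))  # None, the queue is now empty
```

Keeping only the last item means the main process holds one result at a time, so the queue itself should not be what grows.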
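The memory used by the functions called inside this process should also be released, and I believe that in re-schedule mode the process terminates after scheduling a new thread with threading.Timer. Do you have any idea what is going on? I thought the garbage collector would delete objects that have no more references and free their memory. Am I misunderstanding how the garbage collector behaves?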
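
To spell out the expectation I am describing, here is a small sketch that checks it with `weakref`. It assumes CPython, where an object is freed as soon as its reference count drops to zero. `Result` is a hypothetical stand-in for the `(mf, pd_original_matrice, index)` payload, not a class from my code. The second function shows the opposite case: anything that still holds a reference (a list, a queue, the argument list handed to a timer) keeps the old object alive after the variable is rebound.

```python
import gc
import weakref


class Result:
    """Hypothetical stand-in for one iteration's result objects."""

    def __init__(self, size):
        self.data = [0.0] * size


def rebinding_frees_previous():
    current = Result(1000)
    probe = weakref.ref(current)  # a weak reference does not keep it alive
    current = Result(1000)        # rebind: the first Result is now unreferenced
    gc.collect()                  # harmless on CPython, helps on other runtimes
    return probe() is None


def extra_reference_keeps_alive():
    keep = []                     # plays the role of any lingering container
    current = Result(1000)
    probe = weakref.ref(current)
    keep.append(current)
    current = Result(1000)        # rebind, but `keep` still refers to the old one
    gc.collect()
    return probe() is not None


if __name__ == "__main__":
    print(rebinding_frees_previous())
    print(extra_reference_keeps_alive())
```

If the first case held for every object in my loop, memory should stay flat, so I suspect something in the second category is holding on to old results.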
Thanks