Question

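I have a function that runs in a dedicated multiprocessing process. It can run in an endless loop, re-schedule itself with a threading timer, or run just once. In the first and second cases, memory usage grows without bound, even though in endless-loop mode I am only rebinding references to variables. I use a queue to communicate with the main process, and the main process pops everything from it.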

Here is the function's code:

def schedule(args, fn, q=None):

    isendless = True
    while isendless:

        # retrieve from database
        pd_original_matrice = get_rating(args)

        # save userID / index correspondence
        index = pd_original_matrice['CustomerId'].values
        index = dict(zip(index, range(len(index))))

        now = dt.now().replace(microsecond=0) # start timer

        mf = MF(pd_original_matrice.iloc[:, 1:].fillna(0).values.astype(np.float64),
                args.nbFeatures, args.learning_rate, args.regularization, args.nbEpoch)

        training_rmse, testing_rmse = mf.train(args.prediction_method, args.factorization_method, split=args.split, reduc=args.reduc)

        then = dt.now().replace(microsecond=0) # get elapsed time
        log.normal('__MF has ended, duration %s , min test RMSE of %0.6f achieved at epoch %d'
                    %( str(then-now), min(testing_rmse), testing_rmse.index(min(testing_rmse))+1 ))

        # schedule a new MF
        if args.cycle != 0:
            log.normal('___New MF schedule in %d seconds' %args.cycle)
            t = timer(args.cycle, schedule, [args, fn, q]) # re-schedule
            t.start()
            print(Fore.WHITE)

        if q is not None: # for endless or re-scheduled
            q.put( (mf, pd_original_matrice, index) )

        else: # For one-time mf
            return mf, pd_original_matrice, index

        if args.endless_mf != 1: # final do while test
            isendless = False
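
To make the queue hand-off concrete, here is a minimal sketch of the kind of setup described in this question. It is not the actual main-process code: `worker`, `drain`, and the payload are hypothetical stand-ins, and `worker` only imitates the `q.put(...)` step of `schedule`.

```python
import multiprocessing as mp
import queue


def worker(q, rounds):
    """Hypothetical stand-in for schedule(): push one result per round."""
    for i in range(rounds):
        result = {"round": i, "payload": list(range(1000))}
        q.put(result)


def drain(q):
    """Pop everything currently available so nothing accumulates in the queue."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q, 3))
    p.start()
    received = [q.get() for _ in range(3)]  # blocking get, one per round
    p.join()
    print(len(received), "results received;", len(drain(q)), "left in the queue")
```

Each object put on a `multiprocessing.Queue` is pickled and copied into the receiving process, so the main process holds its own copy of every result for as long as it keeps a reference to it.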

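The memory used by the functions called inside this process should also be freed, and I believe that in re-schedule mode the process terminates after scheduling a new thread with threading.Timer. Do you have any ideas? I thought the garbage collector would delete objects that have no more references and free their memory. Is this related to a misunderstanding of how the garbage collector behaves?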
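
The garbage-collection assumption can be checked in isolation. The following is a minimal sketch, not taken from the code above: `Model` is a hypothetical stand-in for the `MF` object, and `rebind_and_check` rebinds one name per iteration and uses a weak reference to see whether the previous object was actually released.

```python
import gc
import weakref


class Model:
    """Hypothetical stand-in for the MF object built on each iteration."""

    def __init__(self, size):
        self.data = [0.0] * size


def rebind_and_check(iterations=3, keep=None):
    """Rebind `mf` each iteration and record whether the previous object was freed."""
    freed = []
    previous = None
    mf = None
    for _ in range(iterations):
        mf = Model(10_000)  # rebinding drops this name's reference to the old object
        if keep is not None:
            keep.append(mf)  # an extra reference, e.g. a cache or an undrained container
        gc.collect()  # also collect anything that is only held by reference cycles
        if previous is not None:
            # a weak reference returns None once its target has been destroyed
            freed.append(previous() is None)
        previous = weakref.ref(mf)
    return freed


if __name__ == "__main__":
    print(rebind_and_check())
    print(rebind_and_check(keep=[]))
```

If the first call reports every previous object as freed while the second does not, the difference comes from the extra reference rather than from the collector. The same kind of check could be applied to the real objects to find out what still refers to them. Note that freed objects do not always lower the process's memory usage as reported by the operating system, because the allocator may keep the released memory for reuse.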

Thanks

Memory leak in a looping or re-scheduled Python process

0 answers: