I'm running into a problem using TensorFlow to test multiple DNNClassifiers with pandas DataFrames. The error I get is ResourceExhaustedError ... Too many open files. I tried using del together with gc.collect() to get TensorFlow to close its files, but that did not solve the problem. The answer to the earlier question tf.estimator Error: ResourceExhausted: too many open files (TF keeps events.out.tfevents files open) involved editing TensorFlow itself to make it work, but I can't modify TensorFlow in my current environment. The code that causes the error is below.
(df, featurecolumns) = create_df('r')
(testdf, testfeaturecolumns) = create_df('r9')
x = 1
y = 1
maxunits = 100
maxaccuracy = 0.0
bestunits = [0, 0]
testbar = Bar("Testing models: ", max=maxunits*maxunits)
while x <= maxunits:
    y = 1
    while y <= maxunits:
        dnnclassifier = tf.estimator.DNNClassifier(feature_columns=featurecolumns, hidden_units=[x, y])
        dnnclassifier.train(input_fn=pd_input_fn(df, 'flag'))
        dnnclassifierresults = dnnclassifier.evaluate(input_fn=pd_input_fn(testdf, 'flag'))
        if dnnclassifierresults['accuracy'] > maxaccuracy:
            maxaccuracy = dnnclassifierresults['accuracy']
            bestunits = [x, y]
        y = y + 1
        del dnnclassifier
        del dnnclassifierresults
        gc.collect()
        testbar.next()
    x = x + 1
testbar.finish()
print("Best Parameters: " + str(bestunits) + " units with " + str(maxaccuracy*100) + "% accuracy.")
Answer 0 (score: 2)
You can use multiprocessing and run the logic in a new process; when the process exits, all of its resources are released. Something like this:
import multiprocessing as mp

class TestNetworkProcess(mp.Process):
    def __init__(self, x, y):  # add other parameters (data, feature columns, ...)
        super().__init__()  # don't forget
        self.x, self.y = x, y
        # A plain attribute set in run() lives only in the child process,
        # so use a shared value to report the accuracy back to the parent.
        self._accuracy = mp.Value('d', 0.0)

    def run(self):
        # your code here, e.g.
        dnnclassifier = tf.estimator.DNNClassifier(feature_columns=featurecolumns,
                                                   hidden_units=[self.x, self.y])
        dnnclassifier.train(input_fn=pd_input_fn(df, 'flag'))
        dnnclassifierresults = dnnclassifier.evaluate(input_fn=pd_input_fn(testdf, 'flag'))
        self._accuracy.value = dnnclassifierresults['accuracy']

    @property
    def accuracy(self):
        return self._accuracy.value

# initialization code
best_accuracy, best_xy = 0.0, None
for x in range(1, 101):
    for y in range(1, 101):
        proc = TestNetworkProcess(x, y)
        proc.start()
        proc.join()
        if proc.accuracy > best_accuracy:
            best_accuracy, best_xy = proc.accuracy, (x, y)
Modify the code to pass in the training and test data, etc. Forked Python processes use copy-on-write, so you can hand the pandas DataFrames to __init__ without reloading them each time.