I'm running into a problem using TensorFlow to test multiple DNNClassifiers with pandas DataFrames. The error I get is ResourceExhaustedError ... Too many open files. I tried using del together with gc.collect() to get TensorFlow to close its files, but that did not solve the problem. The answer to the earlier question tf.estimator Error: ResourceExhausted: too many open files (TF keeps events.out.tfevents files open) involved editing TensorFlow itself to make it work, but I can't modify TensorFlow in my current environment. The code that causes the error is below.
(df, featurecolumns) = create_df('r')
(testdf, testfeaturecolumns) = create_df('r9')
x = 1
y = 1
maxunits = 100
maxaccuracy = 0.0
bestunits = [0, 0]
testbar = Bar("Testing models: ", max=maxunits*maxunits)
while x <= maxunits:
    y = 1
    while y <= maxunits:
        dnnclassifier = tf.estimator.DNNClassifier(feature_columns=featurecolumns, hidden_units=[x, y])
        dnnclassifier.train(input_fn=pd_input_fn(df, 'flag'))
        dnnclassifierresults = dnnclassifier.evaluate(input_fn=pd_input_fn(testdf, 'flag'))
        if dnnclassifierresults['accuracy'] > maxaccuracy:
            maxaccuracy = dnnclassifierresults['accuracy']
            bestunits = [x, y]
        y = y + 1
        del dnnclassifier
        del dnnclassifierresults
        gc.collect()
        testbar.next()
    x = x + 1
testbar.finish()
print("Best Parameters: " + str(bestunits) + " units with " + str(maxaccuracy*100) + "% accuracy.")
Answer 0 (score: 2)
You can use multiprocessing and run the logic in a new process; when the process exits, all of its resources are released. Something like this:
import multiprocessing as mp

class TestNetworkProcess(mp.Process):
    def __init__(self, x, y):  # add other parameters (data, feature columns, ...)
        super().__init__()  # don't forget
        self.x, self.y = x, y
        # A plain attribute set in run() lives only in the child process,
        # so use a shared value to report the accuracy back to the parent.
        self._accuracy = mp.Value('d', 0.0)

    def run(self):
        # your code here, e.g.
        dnnclassifier = tf.estimator.DNNClassifier(feature_columns=featurecolumns,
                                                   hidden_units=[self.x, self.y])
        dnnclassifier.train(input_fn=pd_input_fn(df, 'flag'))
        dnnclassifierresults = dnnclassifier.evaluate(input_fn=pd_input_fn(testdf, 'flag'))
        self._accuracy.value = dnnclassifierresults['accuracy']

    @property
    def accuracy(self):
        return self._accuracy.value

# initialization code
best_accuracy, best_xy = 0.0, None
for x in range(1, 101):
    for y in range(1, 101):
        proc = TestNetworkProcess(x, y)
        proc.start()
        proc.join()
        if proc.accuracy > best_accuracy:
            best_accuracy, best_xy = proc.accuracy, (x, y)
Modify the code to pass in the training and test data, etc. Forked Python processes use copy-on-write, so you can hand the pandas DataFrames to __init__ without reloading them each time.