I have a project that uses canned estimators in TensorFlow, and I'm trying to use the train_and_evaluate method.
estimator = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=hidden_units,
    model_dir=model_dir,
    optimizer=tf.train.ProximalAdagradOptimizer(
        learning_rate=0.01,
        l1_regularization_strength=0.001))
Every time I look at the console output, it shows that the loss is essentially zero:
INFO:tensorflow:loss = 2.896826e-06, step = 875
INFO:tensorflow:global_step/sec: 5.96785
INFO:tensorflow:loss = 1.9453131e-05, step = 975 (16.756 sec)
INFO:tensorflow:global_step/sec: 7.2834
INFO:tensorflow:loss = 8.6957414e-05, step = 1075 (13.730 sec)
INFO:tensorflow:global_step/sec: 7.36042
INFO:tensorflow:loss = 0.0004585028, step = 1175 (13.586 sec)
INFO:tensorflow:global_step/sec: 7.38419
INFO:tensorflow:loss = 0.0012249642, step = 1275 (13.542 sec)
INFO:tensorflow:global_step/sec: 7.3658
INFO:tensorflow:loss = 0.002194246, step = 1375 (13.576 sec)
INFO:tensorflow:global_step/sec: 7.33054
INFO:tensorflow:loss = 0.0031063582, step = 1475 (13.641 sec)
This started happening after I changed my input_fn. I used to load the CSV into a pandas DataFrame and work from there, but my full dataset is over 10 GB (dimensions 800 x 1,500,000), and every time I saved the model the model folder grew to an absurd size (over 200 GB), so I decided to switch to an iterator instead (I found this input function in a tutorial somewhere, and it worked well):
def input_fn_train(filenames,
                   num_epochs=None,
                   shuffle=True,
                   skip_header_lines=0,
                   batch_size=200,
                   modeTrainEval=True):
    filename_dataset = tf.data.Dataset.from_tensor_slices(filenames)
    if shuffle:
        filename_dataset = filename_dataset.shuffle(len(filenames))
    # Read each CSV file line by line, skipping any header rows.
    dataset = filename_dataset.flat_map(
        lambda filename: tf.data.TextLineDataset(filename).skip(skip_header_lines))
    dataset = dataset.map(parse_csv)
    if shuffle:
        dataset = dataset.shuffle(buffer_size=batch_size * 10)
    dataset = dataset.repeat(num_epochs)
    dataset = dataset.batch(batch_size)
    iterator = dataset.make_one_shot_iterator()
    features = iterator.get_next()
    # Split the label column off from the feature dict.
    labels = features.pop(LABEL_COLUMN)
    if not modeTrainEval:
        return features, None
    return features, labels
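For reference, the parse_csv helper used above follows the usual pattern of decoding one text line into a dict of column tensors. A minimal sketch looks like this — note that CSV_COLUMNS, CSV_DEFAULTS, and the feature names here are placeholders, not my actual schema:

```python
import tensorflow as tf

# Placeholder schema -- the real project has 800 columns.
CSV_COLUMNS = ['feat1', 'feat2', 'feat3', 'label']
CSV_DEFAULTS = [[0.0], [0.0], [0.0], [0]]
LABEL_COLUMN = 'label'

def parse_csv(line):
    """Decode a single CSV line into a dict mapping column name -> tensor."""
    fields = tf.io.decode_csv(line, record_defaults=CSV_DEFAULTS)
    return dict(zip(CSV_COLUMNS, fields))
```

The record_defaults serve double duty: they fill in missing values and fix each column's dtype (the integer default makes the label an int32 tensor).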
Unfortunately, with this change my loss is always essentially zero, and the resulting predictions are very poor (50% accuracy). I can't figure out why.
(GitHub link with a sample dataset and my code)