I was trying to get my hands dirty with TensorFlow by following the Wide and Deep Learning example code. I modified certain imports so that it works with Python 3.4 on CentOS 7.
The highlights of the changes are:
-import urllib
+import urllib.request
...
-urllib.urlretrieve
+urllib.request.urlretrieve
...
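For reference, a guarded import is one way to keep the same script running under both Python 2 and Python 3 instead of editing the import in place. This is only a minimal sketch and not part of the tutorial; the try/except layout is an assumption about how the change could be structured.

```python
# Sketch only: pick the right urlretrieve for the running interpreter.
try:
    from urllib.request import urlretrieve  # Python 3 location
except ImportError:
    from urllib import urlretrieve  # Python 2 location

# The rest of the script can then call urlretrieve(url, filename) directly.
```

With this in place, calls written as `urllib.urlretrieve(...)` or `urllib.request.urlretrieve(...)` would become plain `urlretrieve(...)`.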
When I run the code, I get the following error:
Training data is downloaded to /tmp/tmpw06u4_xl
Test data is downloaded to /tmp/tmpjliqxhwh
model directory = /tmp/tmpcyll7kck
WARNING:tensorflow:Setting feature info to {'education': TensorSignature(dtype=tf.string, shape=None, is_sparse=True), 'capital_gain': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(32561)]), is_sparse=False), 'capital_loss': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(32561)]), is_sparse=False), 'hours_per_week': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(32561)]), is_sparse=False), 'gender': TensorSignature(dtype=tf.string, shape=None, is_sparse=True), 'occupation': TensorSignature(dtype=tf.string, shape=None, is_sparse=True), 'native_country': TensorSignature(dtype=tf.string, shape=None, is_sparse=True), 'race': TensorSignature(dtype=tf.string, shape=None, is_sparse=True), 'age': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(32561)]), is_sparse=False), 'education_num': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(32561)]), is_sparse=False), 'marital_status': TensorSignature(dtype=tf.string, shape=None, is_sparse=True), 'workclass': TensorSignature(dtype=tf.string, shape=None, is_sparse=True), 'relationship': TensorSignature(dtype=tf.string, shape=None, is_sparse=True)}
WARNING:tensorflow:Setting targets info to TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(32561)]), is_sparse=False)
Traceback (most recent call last):
  File "wide_n_deep_tutorial.py", line 213, in <module>
    tf.app.run()
  File "/usr/lib/python3.4/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "wide_n_deep_tutorial.py", line 209, in main
    train_and_eval()
  File "wide_n_deep_tutorial.py", line 202, in train_and_eval
    m.fit(input_fn=lambda: input_fn(df_train), steps=FLAGS.train_steps)
  File "/usr/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 240, in fit
    max_steps=max_steps)
  File "/usr/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 550, in _train_model
    train_op, loss_op = self._get_train_ops(features, targets)
  File "/usr/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 182, in _get_train_ops
    logits = self._logits(features, is_training=True)
  File "/usr/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 260, in _logits
    dnn_feature_columns = self._get_dnn_feature_columns()
  File "/usr/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 224, in _get_dnn_feature_columns
    feature_column_ops.check_feature_columns(self._dnn_feature_columns)
  File "/usr/lib/python3.4/site-packages/tensorflow/contrib/layers/python/layers/feature_column_ops.py", line 318, in check_feature_columns
    f.name))
ValueError: Duplicate feature column key found for column: education_embedding. This usually means that the column is almost identical to another column, and one must be discarded.
Do I have to change some variable, or is this a Python 3 problem? How can I move forward with this tutorial?
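Answer 0 (score: 1)
Final update: I hit this problem on the recommended 0.10rc0 branch, but after reinstalling from master (no branch on the git clone) the problem went away. I checked the source and the bug is fixed there. After applying the urllib.request change you already mentioned, Python 3 now gets the same results as Python 2 for the wide_n_deep mode.
For anyone who is still on the 0.10rc0 branch, read on:
I had the same problem and did some debugging. It looks like a bug in the _EmbeddingColumn class in tensorflow/contrib/layers/python/layers/feature_column.py. The key(self) property is affected by this bug: https://bugs.python.org/issue24931
So instead of a nice unique key, every _EmbeddingColumn instance gets the following key: '_EmbeddingColumn()'
This causes the check_feature_columns() function in feature_column_ops.py to decide that the second _EmbeddingColumn instance is a duplicate, because all of their keys are identical.
I'm a Python noob and couldn't figure out how to monkey patch a property, so I worked around it by creating a subclass at the top of the wide_n_deep tutorial file: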
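To illustrate the failure mode, here is a minimal pure-Python sketch of a duplicate-key check. It is not TensorFlow's implementation: `Column`, `BrokenColumn`, and `find_duplicate_keys` are hypothetical names, and `BrokenColumn` only simulates the symptom (a key that has lost its field values) rather than reproducing the Python 3.4 bug itself.

```python
from collections import namedtuple


class Column(namedtuple("Column", ["source", "dimension"])):
    """Hypothetical stand-in for a feature column with a field-based key."""

    @property
    def key(self):
        # Build the key from the field values so distinct columns differ.
        fields = ", ".join(
            "{}={!r}".format(name, value)
            for name, value in zip(self._fields, self))
        return "{}({})".format(type(self).__name__, fields)


class BrokenColumn(Column):
    """Simulates the symptom: the key no longer contains any field values."""

    @property
    def key(self):
        return "{}()".format(type(self).__name__)


def find_duplicate_keys(columns):
    """Return every key that has already been seen earlier in the list."""
    seen = set()
    duplicates = []
    for column in columns:
        if column.key in seen:
            duplicates.append(column.key)
        seen.add(column.key)
    return duplicates


good = [Column("workclass", 8), Column("education", 8)]
bad = [BrokenColumn("workclass", 8), BrokenColumn("education", 8)]
print(find_duplicate_keys(good))
print(find_duplicate_keys(bad))
```

In this sketch the two healthy columns pass the check, while the two broken ones collide on the same key even though they wrap different source columns, which mirrors the error reported for the second embedding column.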
# EmbeddingColumn for Python 3.4 has a problem with key property
# can't monkey patch a property, so subclass it and make a method to create the
# subclass to use instead of "embedding_column"
from tensorflow.contrib.layers.python.layers.feature_column import _EmbeddingColumn


class _MonkeyEmbeddingColumn(_EmbeddingColumn):
    # override the key property
    @property
    def key(self):
        return "{}".format(self)


def monkey_embedding_column(sparse_id_column,
                            dimension,
                            combiner="mean",
                            initializer=None,
                            ckpt_to_load_from=None,
                            tensor_name_in_ckpt=None):
    return _MonkeyEmbeddingColumn(sparse_id_column, dimension, combiner, initializer, ckpt_to_load_from, tensor_name_in_ckpt)
Then find calls like this:
tf.contrib.layers.embedding_column(workclass, dimension=8)
and replace "tf.contrib.layers." with "monkey_" so that you now have:
deep_columns = [
    monkey_embedding_column(workclass, dimension=8),
    monkey_embedding_column(education, dimension=8),
    monkey_embedding_column(marital_status,
                            dimension=8),
    monkey_embedding_column(gender, dimension=8),
    monkey_embedding_column(relationship, dimension=8),
    monkey_embedding_column(race, dimension=8),
    monkey_embedding_column(native_country,
                            dimension=8),
    monkey_embedding_column(occupation, dimension=8),
    age,
    education_num,
    capital_gain,
    capital_loss,
    hours_per_week,
]
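So it now uses the _MonkeyEmbeddingColumn class with a modified key property (one that behaves like all the other key properties in feature_column.py). This lets the code run to completion, but I'm not 100% sure it is correct, because it reports an accuracy of:
accuracy: 0.818316
Since this is slightly worse than wide-only training, I wonder whether Python 2 gives the same accuracy, or whether my fix lowered the accuracy by causing a training problem.
Update: I installed under Python 2, and wide_n_deep gets an accuracy above 0.85, so this "fix" lets the code run but appears to do the wrong thing. I'll debug and see what Python 2 gets for these values, and whether this can be fixed properly in Python 3. I'm curious too.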
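As a side note on patching a property: in plain Python a property can be replaced by assigning a new property object to the class attribute, since properties live on the class rather than on instances. The sketch below shows this on a hypothetical `Base` class. Whether the same approach is safe on TensorFlow's private _EmbeddingColumn class, and whether it would change the accuracy reported above, is untested here.

```python
class Base(object):
    """Hypothetical class whose key property returns an unhelpful value."""

    def __init__(self, name):
        self.name = name

    @property
    def key(self):
        return "Base()"


def _fixed_key(self):
    # Include the instance's name so each instance gets its own key.
    return "Base(name={!r})".format(self.name)


# Replace the property on the class itself, not on an instance.
# Existing and future instances both pick up the new getter.
Base.key = property(_fixed_key)

column = Base("education")
print(column.key)
```

Assigning to `column.key` on an instance would raise AttributeError because the original property has no setter, which is likely the obstacle behind "can't monkey patch a property".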
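Answer 1 (score: 0)
First, when you download the tutorial source code from the TensorFlow site, use master (no branch on the git clone), as Jesse suggested. Second, you only need to pass the first two arguments to the class. You will get some warnings along the way; just ignore them and wait for the program to finish. So use the following code on top of the tutorial source code, or download the modified code: Modified Version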
from tensorflow.contrib.layers.python.layers.feature_column import _EmbeddingColumn


class _MonkeyEmbeddingColumn(_EmbeddingColumn):
    # override the key property
    @property
    def key(self):
        return "{}".format(self)


def monkey_embedding_column(sparse_id_column,
                            dimension,
                            combiner="mean",
                            initializer=None,
                            ckpt_to_load_from=None,
                            tensor_name_in_ckpt=None):
    return _MonkeyEmbeddingColumn(sparse_id_column, dimension)