在我的数据集上运行TensorFlow中的wide_n_deep_tutorial程序时,会显示以下错误。
"TypeError: Signature mismatch. Keys must be dtype <dtype: 'string'>, got <dtype:'int64'>"
以下是代码段:
def input_fn(df):
"""Input builder function."""
# Creates a dictionary mapping from each continuous feature column name (k) to
# the values of that column stored in a constant Tensor.
continuous_cols = {k: tf.constant(df[k].values) for k in CONTINUOUS_COLUMNS}
# Creates a dictionary mapping from each categorical feature column name (k)
# to the values of that column stored in a tf.SparseTensor.
categorical_cols = {k: tf.SparseTensor(
indices=[[i, 0] for i in range(df[k].size)],
values=df[k].values,
shape=[df[k].size, 1])
for k in CATEGORICAL_COLUMNS}
# Merges the two dictionaries into one.
feature_cols = dict(continuous_cols)
feature_cols.update(categorical_cols)
# Converts the label column into a constant Tensor.
label = tf.constant(df[LABEL_COLUMN].values)
# Returns the feature columns and the label.
return feature_cols, label
def train_and_eval():
"""Train and evaluate the model."""
train_file_name, test_file_name = maybe_download()
df_train=train_file_name
df_test=test_file_name
df_train[LABEL_COLUMN] = (
df_train["impression_flag"].apply(lambda x: "generated" in x)).astype(str)
df_test[LABEL_COLUMN] = (
df_test["impression_flag"].apply(lambda x: "generated" in x)).astype(str)
model_dir = tempfile.mkdtemp() if not FLAGS.model_dir else FLAGS.model_dir
print("model directory = %s" % model_dir)
m = build_estimator(model_dir)
print('model succesfully build!')
m.fit(input_fn=lambda: input_fn(df_train), steps=FLAGS.train_steps)
print('model fitted!!')
results = m.evaluate(input_fn=lambda: input_fn(df_test), steps=1)
for key in sorted(results):
print("%s: %s" % (key, results[key]))
感谢任何帮助。
答案 0 :(得分:0)
将有助于查看错误消息之前的输出,以确定此错误跳过的进程的哪个部分,但是,该消息非常清楚地表明密钥应该是字符串,而是给出了整数。我只是猜测,但是在脚本的前面部分是否正确列出了列名,因为它们可能是这个实例中引用的键?
答案 1 :(得分:0)
根据your traceback判断,您遇到的问题是由您对要素列的输入或input_fn
的输出造成的。您的稀疏张量最有可能被赋予values
参数的非字符串dtypes;稀疏特征列期望字符串值。确保您提供正确的数据,如果您确定自己是,则可以尝试以下操作:
categorical_cols = {k: tf.SparseTensor(
indices=[[i, 0] for i in range(df[k].size)],
values=df[k].astype(str).values, # Convert sparse values to string type
shape=[df[k].size, 1])
for k in CATEGORICAL_COLUMNS}
答案 2 :(得分:0)
这就是我解决这一挑战的方法:
from sklearn.model_selection import train_test_split
# split the data set
X_train, X_test, y_train, y_test = train_test_split(M, N, test_size=0.3)
# covert string to int64 for training set
X_train = X_train[X_train.columns] = X_train[X_train.columns].apply(np.int64)
y_train = y_train.apply(np.int64)
# covert string to int64 for testing set
X_test = X_test[X_test.columns] = X_test[X_test.columns].apply(np.int64)
y_test = y_test.apply(np.int64)