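I am trying to run DNNRegressor on some simple data so that I can test its accuracy. The model should take a bank transaction and try to predict its price, but I get some strange results, which I believe are caused by a mistake in my code.
My code is here: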
from __future__ import print_function
from __future__ import division
from __future__ import absolute_import
import itertools
import pandas as pd
import numpy as np
import tensorflow as tf
print('Running version of tensorflow')
print(tf.__version__)
tf.logging.set_verbosity(tf.logging.DEBUG)
names = [
    'trans',
    'price',
]
predict_names = [
    'trans'
]
dtypes = {
    'trans': str,
    'price': np.float32,
}
df = pd.read_csv('simple.csv', names=names, dtype=dtypes, na_values='?')
# Split the data into a training set and an eval set.
# Copy the slices so that pop() below does not modify a view of df.
training_data = df[:50].copy()
eval_data = df[50:].copy()
print("Training with this :\n")
print(training_data)
# Separate input features from labels
training_label = training_data.pop('price')
eval_label = eval_data.pop('price')
# Input functions
training_input_fn = tf.estimator.inputs.pandas_input_fn(x=training_data, y=training_label, batch_size=1, shuffle=True, num_epochs=None)
eval_input_fn = tf.estimator.inputs.pandas_input_fn(x=eval_data, y=eval_label, batch_size=1, shuffle=False, num_epochs=None)
# Hash the column into buckets since it is a string
transformed_trans = tf.feature_column.categorical_column_with_hash_bucket('trans', 50)
print("Transformed words **********************")
print(transformed_trans)
dnn_features = [tf.feature_column.indicator_column(transformed_trans)]
# regressor = tf.contrib.learn.LinearRegressor(feature_columns=[trans])
dnnregressor = tf.contrib.learn.DNNRegressor(feature_columns=dnn_features, hidden_units=[50, 30, 10])
# Train the model
dnnregressor.fit(input_fn=training_input_fn, steps=1)
# Evaluate the training
dnnregressor.evaluate(input_fn=eval_input_fn, steps=1)
# Predictions
predictdf = pd.read_csv('simple_predict.csv', names=names, dtype=dtypes, na_values='?')
predict_input_fn = tf.estimator.inputs.pandas_input_fn(x=predictdf,shuffle=False, num_epochs=1)
print("Predicting scores **********************")
y = dnnregressor.predict_scores(input_fn=predict_input_fn)
for x in y:
    print(str(x) + "\n")
My data looks like this:
simple.csv:
Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
Cafe,10
Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
Cafe,10
Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
simple_predict.csv:
Uber
Food
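I assumed that this dataset would be easy to predict, that I would get zero loss, and that the predictions would be spot on. That is not the case: I always get exactly the same prediction for Uber and Food, and I cannot make sense of the results I get.
Am I missing anything in the code? Or am I misunderstanding how DNNRegressor is supposed to work?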
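As a sanity check on that expectation, here is a minimal sketch in plain Python (not part of the original code) that fits a per-category lookup table to the rows of simple.csv shown above. The helper names `fit_lookup` and `mean_squared_error` are made up for illustration. It only shows that each transaction name maps to a single price in this data, so zero error is achievable in principle; it says nothing about how DNNRegressor itself trains.

```python
from collections import defaultdict

# Rows copied from the simple.csv listing above.
rows = [
    ("Uber", 4.0), ("Food", 12.0), ("Coffee", 4.0), ("Cafe", 10.0),
    ("Coffee", 4.0), ("Cafe", 10.0), ("Uber", 4.0), ("Food", 12.0),
    ("Coffee", 4.0), ("Cafe", 10.0), ("Coffee", 4.0), ("Cafe", 10.0),
    ("Uber", 4.0), ("Food", 12.0), ("Coffee", 4.0), ("Cafe", 10.0),
    ("Coffee", 4.0),
]


def fit_lookup(data):
    """Return the mean price per transaction name (hypothetical helper)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for trans, price in data:
        sums[trans] += price
        counts[trans] += 1
    return {trans: sums[trans] / counts[trans] for trans in sums}


def mean_squared_error(data, table):
    """Mean squared error of the lookup table on the given rows."""
    return sum((table[trans] - price) ** 2 for trans, price in data) / len(data)


table = fit_lookup(rows)
print(table)
print(mean_squared_error(rows, table))
```

If a trained model predicts the same value for Uber and Food, it is doing worse than this trivial baseline, which suggests the two inputs look identical to the model or that training has barely started.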
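Answer 0 (score: 0)
The problem with the code above is that the hash bucket size is too low: with too few buckets, distinct strings can hash into the same bucket, and the model then cannot tell them apart. Simply raising the hash bucket size from 50 to 300 makes it all work.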
transformed_trans = tf.feature_column.categorical_column_with_hash_bucket('trans', 300)
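To illustrate why the bucket count matters, here is a rough sketch in plain Python. It is an assumption-laden illustration, not TensorFlow's implementation: it uses MD5 as a stand-in hash, so the bucket indices it prints will not match the ones TensorFlow assigns, and the function names `bucket_of`, `find_collisions` and `collision_probability` are hypothetical. The last function assumes an idealized uniform hash and estimates how likely it is that at least two of the distinct categories share a bucket.

```python
import hashlib


def bucket_of(value, num_buckets):
    """Map a string to a bucket index using MD5 (stand-in hash, illustration only)."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def find_collisions(values, num_buckets):
    """Return groups of distinct values that land in the same bucket."""
    buckets = {}
    for value in sorted(set(values)):
        buckets.setdefault(bucket_of(value, num_buckets), []).append(value)
    return [group for group in buckets.values() if len(group) > 1]


def collision_probability(num_keys, num_buckets):
    """Chance that at least two of num_keys distinct keys share a bucket,
    assuming each key is hashed uniformly and independently."""
    p_all_distinct = 1.0
    for i in range(num_keys):
        p_all_distinct *= (num_buckets - i) / num_buckets
    return 1.0 - p_all_distinct


categories = ["Uber", "Food", "Coffee", "Cafe"]
for n in (3, 50, 300):
    print(n, find_collisions(categories, n), round(collision_probability(4, n), 4))
```

Under the uniform-hash assumption, more buckets make a collision less likely but never impossible. For a small, known set of categories, a vocabulary-based column such as tf.feature_column.categorical_column_with_vocabulary_list avoids hashing altogether.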