TensorFlow DNNRegressor predict_scores output

Time: 2017-09-12 02:52:02

Tags: python machine-learning tensorflow

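I am trying to run DNNRegressor on some simple data so that I can test its accuracy. The model should take any bank transaction and try to predict its price, but I am getting strange results, which I believe come from a mistake in my code.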

My code is here:

from __future__ import print_function
from __future__ import division
from __future__ import absolute_import

import itertools
import pandas as pd
import numpy as np
import tensorflow as tf

print('Running version of tensorflow')
print(tf.__version__)

tf.logging.set_verbosity(tf.logging.DEBUG)

names = [
    'trans',
    'price',
]

predict_names = [
    'trans'
]

dtypes = {
    'trans': str,
    'price': np.float32,
}

df = pd.read_csv('simple.csv', names=names, dtype=dtypes, na_values='?')

# Split the data into a training set and an eval set.
training_data = df[:50]
eval_data = df[50:]
print("Training with this :\n")
print(training_data)

# Separate input features from labels
training_label = training_data.pop('price')
eval_label = eval_data.pop('price')

# Input functions for training and evaluation
training_input_fn = tf.estimator.inputs.pandas_input_fn(x=training_data, y=training_label, batch_size=1, shuffle=True, num_epochs=None)

eval_input_fn = tf.estimator.inputs.pandas_input_fn(x=eval_data, y=eval_label, batch_size=1, shuffle=False, num_epochs=None)

# Hash the column into buckets, since it is a string
transformed_trans = tf.feature_column.categorical_column_with_hash_bucket('trans', 50)
print("Transformed words **********************")
print(transformed_trans)

dnn_features = [tf.feature_column.indicator_column(transformed_trans)]
# regressor = tf.contrib.learn.LinearRegressor(feature_columns=[trans])

dnnregressor = tf.contrib.learn.DNNRegressor(feature_columns=dnn_features, hidden_units=[50, 30, 10])

# Train the model
dnnregressor.fit(input_fn=training_input_fn, steps=1)

# Evaluate the training
dnnregressor.evaluate(input_fn=eval_input_fn, steps=1)

# Predictions
predictdf = pd.read_csv('simple_predict.csv', names=names, dtype=dtypes, na_values='?')
predict_input_fn = tf.estimator.inputs.pandas_input_fn(x=predictdf,shuffle=False, num_epochs=1)

print("Predicting scores **********************")


y = dnnregressor.predict_scores(input_fn=predict_input_fn)
for x in y:
    print(str(x)+"\n")

My data looks like this:

simple.csv:

Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
Cafe,10
Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
Cafe,10
Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4


simple_predict.csv:

Uber
Food

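I thought that with a dataset this predictable I would get zero loss and the predictions would be spot on. That is not what happens: I always get exactly the same prediction for Uber and Food, and I cannot make sense of the values I get.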
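
To make that expectation concrete, here is a small sketch in plain Python, independent of TensorFlow, that computes the mean price per category from the rows of simple.csv shown above. The helper name `mean_price_by_category` is made up for illustration. Because every category always carries the same price in this data, a model that fit it perfectly would output these values.

```python
import csv
import io
from collections import defaultdict

# Rows copied verbatim from simple.csv above: trans,price
SIMPLE_CSV = """Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
Cafe,10
Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
Cafe,10
Uber,4
Food,12
Coffee,4
Cafe,10
Coffee,4
"""


def mean_price_by_category(csv_text):
    """Return {trans: mean price} for two-column CSV text (hypothetical helper)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for trans, price in csv.reader(io.StringIO(csv_text)):
        totals[trans] += float(price)
        counts[trans] += 1
    return {trans: totals[trans] / counts[trans] for trans in totals}


expected = mean_price_by_category(SIMPLE_CSV)
# The two rows in simple_predict.csv are Uber and Food.
print(expected["Uber"], expected["Food"])  # prints: 4.0 12.0
```

Identical predictions for Uber and Food are therefore far from this target, since the two categories differ by 8 in the training data.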

Am I missing something in the code? Or am I misunderstanding how DNNRegressor is supposed to work?

1 answer:

Answer 0: (score: 0)

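The problem with the code above is that the hash bucket size is too low. With too few buckets, distinct strings can hash to the same bucket, and the model then receives identical inputs for them. Simply raising the hash bucket size from 50 to 300 makes it all work: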

transformed_trans = tf.feature_column.categorical_column_with_hash_bucket('trans', 300)
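
As a rough illustration of why the bucket count matters, the sketch below estimates how likely it is that at least two distinct category strings share a bucket. It assumes an ideal, uniformly distributed hash, which only approximates the hash TensorFlow actually uses, and the function name `collision_probability` is made up for this example. It says nothing about which bucket a particular string such as Uber or Food lands in.

```python
def collision_probability(num_categories, num_buckets):
    """Chance that at least two of `num_categories` distinct values
    fall into the same bucket, assuming an ideal uniform hash."""
    p_no_collision = 1.0
    for i in range(num_categories):
        # The i-th value must avoid the i buckets already taken.
        p_no_collision *= (num_buckets - i) / num_buckets
    return 1.0 - p_no_collision


# simple.csv contains four distinct values: Uber, Food, Coffee, Cafe.
p_50 = collision_probability(4, 50)
p_300 = collision_probability(4, 300)
print("50 buckets:  %.3f" % p_50)
print("300 buckets: %.3f" % p_300)
```

Under this assumption, more buckets make a collision less likely but never impossible. When the set of categories is known in advance, `tf.feature_column.categorical_column_with_vocabulary_list` gives each value its own ID and avoids collisions altogether.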