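Question:
I am building a model to predict house prices. As a first step, I decided to build a linear regression model in TensorFlow. But when I start training, I see that my accuracy is always 1.
I am new to machine learning. Could someone please tell me what is wrong? I can't figure it out. I searched on Google but found no answer that solves my problem.
Here is my code: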
df_train = df_train.loc[:, ['OverallQual', 'GrLivArea', 'GarageArea', 'SalePrice']]
df_X = df_train.loc[:, ['OverallQual', 'GrLivArea', 'GarageArea']]
df_Y = df_train.loc[:, ['SalePrice']]
df_yy = get_dummies(df_Y)
print("Shape of df_X: ", df_X.shape)
X_train, X_test, y_train, y_test = train_test_split(df_X, df_yy, test_size=0.15)
X_train = np.asarray(X_train).astype(np.float32)
X_test = np.asarray(X_test).astype(np.float32)
y_train = np.asarray(y_train).astype(np.float32)
y_test = np.asarray(y_test).astype(np.float32)
X = tf.placeholder(tf.float32, [None, num_of_features])
y = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.zeros([num_of_features, 1]))
b = tf.Variable(tf.zeros([1]))
prediction = tf.add(tf.matmul(X, W), b)
num_epochs = 20000
# calculating loss
cost = tf.reduce_mean(tf.losses.softmax_cross_entropy(onehot_labels=y, logits=prediction))
optimizer = tf.train.GradientDescentOptimizer(0.00001).minimize(cost)
correct_prediction = tf.equal(tf.argmax(prediction, axis=1), tf.argmax(y, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(num_epochs):
        if epoch % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={X: X_train, y: y_train})
            print('step %d, training accuracy %g' % (epoch, train_accuracy))
        optimizer.run(feed_dict={X: X_train, y: y_train})
    print('test accuracy %g' % accuracy.eval(feed_dict={
        X: X_test, y: y_test}))
The output is:
step 0, training accuracy 1
step 100, training accuracy 1
step 200, training accuracy 1
step 300, training accuracy 1
step 400, training accuracy 1
step 500, training accuracy 1
step 600, training accuracy 1
step 700, training accuracy 1
............................
............................
step 19500, training accuracy 1
step 19600, training accuracy 1
step 19700, training accuracy 1
step 19800, training accuracy 1
step 19900, training accuracy 1
test accuracy 1
Edit: I changed the cost function to this:
cost = tf.reduce_sum(tf.pow(prediction-y, 2))/(2*1241)
But my accuracy is still 1.
Edit 2: In response to lejlot's comment: thank you, lejlot. I changed the accuracy code to this:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    merged_summary = tf.summary.merge_all()
    writer = tf.summary.FileWriter("/tmp/hpp1")
    writer.add_graph(sess.graph)
    for epoch in range(num_epochs):
        if epoch % 5:
            s = sess.run(merged_summary, feed_dict={X: X_train, y: y_train})
            writer.add_summary(s, epoch)
        sess.run(optimizer, feed_dict={X: X_train, y: y_train})
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: X_train, y: y_train})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))
    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: X_train, y: y_train})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')
But now the output is all nan:
....................................
Epoch: 19900 cost= nan W= nan b= nan
Epoch: 19950 cost= nan W= nan b= nan
Epoch: 20000 cost= nan W= nan b= nan
Optimization Finished!
Training cost= nan W= nan b= nan
Answer 0 (score: -1)
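You want to use linear regression, but you are actually using logistic regression. Take a look at tf.losses.softmax_cross_entropy: it applies a softmax to the logits, which yields a probability distribution, i.e. a vector of numbers that sums to 1. In your case the vector has size=1, so it always outputs [1]. For the same reason, tf.argmax over a single column always returns 0 for both the prediction and the label, so the accuracy stays at 1 no matter which cost function you use.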
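The effect can be reproduced without TensorFlow. The sketch below is a minimal, dependency-free illustration; the helpers `softmax` and `argmax` and the sample numbers are made up for this example and are not TensorFlow APIs or the asker's data. With one output column, softmax always yields 1.0 and argmax always yields index 0, so the comparison behind `correct_prediction` is true for every row.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest element (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

# One logit per row, mirroring a prediction tensor of shape [None, 1].
predictions = [[-3.0], [0.0], [250000.0]]
# Made-up sale prices, one column per row, mirroring y of shape [None, 1].
labels = [[208500.0], [181500.0], [223500.0]]

# Softmax over a single element is always [1.0], whatever the logit is.
probs = [softmax(row) for row in predictions]

# argmax over a single element is always 0, so every row counts as a "hit".
hits = [argmax(p) == argmax(y) for p, y in zip(predictions, labels)]
accuracy = sum(hits) / len(hits)

print(probs)
print(accuracy)
```

Because the result does not depend on the values in `predictions`, this metric carries no information about how well a regression model fits.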
The following two examples will help you understand the difference: linear regression and logistic regression.
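For a continuous target such as `SalePrice`, a squared-error cost and an error metric such as RMSE are the usual fit, rather than cross-entropy and accuracy. The following is a rough sketch in plain Python, not the asker's model: it uses a single made-up feature and made-up prices, and the helpers `standardize`, `fit_linear`, and `rmse` are hypothetical names. It also standardizes inputs and targets, on the assumption that the `nan` values reported in Edit 2 come from gradient descent diverging on unscaled inputs.

```python
import math

def standardize(values):
    """Return values shifted to mean 0 and scaled to std 1, plus mean and std."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values], mean, std

def fit_linear(xs, ys, learning_rate, num_epochs):
    """Batch gradient descent on cost = mean((w*x + b - y)^2) / 2."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(num_epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def rmse(xs, ys, w, b):
    """Root mean squared error of the line w*x + b on the given data."""
    return math.sqrt(sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Made-up living areas (square feet) and sale prices, for illustration only.
area = [850.0, 1200.0, 1500.0, 1710.0, 2100.0, 2600.0]
price = [98000.0, 127000.0, 163000.0, 181000.0, 224000.0, 266000.0]

# Raw inputs: with values in the thousands, even a step size of 1e-5
# overshoots and the weights blow up to inf/nan.
w_raw, b_raw = fit_linear(area, price, learning_rate=1e-5, num_epochs=2000)

# Standardized inputs and targets: the same algorithm converges.
area_s, _, _ = standardize(area)
price_s, price_mean, price_std = standardize(price)
w, b = fit_linear(area_s, price_s, learning_rate=0.1, num_epochs=2000)

# Report the error in original price units instead of an accuracy.
final_rmse = rmse(area_s, price_s, w, b) * price_std

print("raw fit finite:", math.isfinite(w_raw))
print("RMSE in price units:", final_rmse)
```

The same idea carries over to the TensorFlow code in the question: keep the squared-error cost, scale the features before feeding them in, and monitor the cost or RMSE rather than `accuracy`.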