How do I correctly shape my tensors for a TensorFlow logistic regression program?

Asked: 2017-10-21 18:14:22

Tags: python tensorflow

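I have a simple program that is supposed to train a logistic regression model on some data.

There is one output class y (0 = false, 1 = true) and there are 25 features. I am struggling to define the shapes of my variables and placeholders correctly. Here is the code.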

#!/usr/bin/env python3

import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn import model_selection
import matplotlib.pyplot as plt
import seaborn as sns
import sys

sns.set(style='white')
sns.set(style='whitegrid',color_codes=True)



bank_data = pd.read_csv('data/bank.csv',header=0,delimiter = ';')
bank_data = bank_data.dropna()

bank_data.drop(bank_data.columns[[0,3,8,9,10,11,12,13]],axis=1,inplace=True)
data_set = pd.get_dummies(bank_data,columns = ['job','marital','default','housing','loan','poutcome'])
data_set.drop(data_set.columns[[14,27]],axis=1,inplace=True)
data_set_y = data_set['y']
data_set_y.replace(('yes','no'),(1.0,0.0),inplace=True)
data_set_X = data_set.drop(['y'],axis=1)
num_samples = data_set.shape[0]
num_features = data_set_X.shape[1]
print ('num_features = ', num_features)


X = tf.placeholder('float',[None,num_features])
y = tf.placeholder('float',[None,1])

W = tf.Variable(tf.zeros([num_features,1]),dtype=tf.float32)
b = tf.Variable(tf.zeros([1]),dtype=tf.float32)

train_X,test_X,train_y,test_y = model_selection.train_test_split(data_set_X,data_set_y,random_state=0)

print (train_y.head())
print (train_X.head())

prediction = tf.add(tf.matmul(X,W),b)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
num_epochs = 1000


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(num_epochs):
        _,l = sess.run([optimizer,cost],feed_dict = {X: train_X, y: train_y})
        if epoch % 50 == 0:
            print ('loss = %f' % (l))

The error I currently get is: ValueError: Cannot feed value of shape (3390,) for Tensor 'Placeholder_1:0', which has shape '(?, 1)'

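train_y is a pandas Series containing only 0s and 1s. Do I need to reshape train_y into two-element one-hot vectors and change the dimensions of the y placeholder accordingly?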

Here is the head() output of the y training data:

4384    0.0
2560    0.0
1470    0.0
1771    0.0
2604    0.0

Having to deal with shaping my tensors is becoming a serious nightmare. Any help is appreciated.

1 Answer:

Answer 0 (score: 0)

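You should convert train_y from a 1-dimensional array of shape (n,) to a 2-dimensional one of shape (n, 1), so that it matches the y placeholder's shape (?, 1). For example, add the following line: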

....
train_X,test_X,train_y,test_y = model_selection.train_test_split(data_set_X,data_set_y,random_state=0)
# train_y is a pandas Series of shape (n,); reshape its underlying array to (n, 1)
train_y = np.reshape(train_y.values, (-1,1))
....
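
To make the shape change concrete, and to touch on the one-hot part of the question, here is a minimal NumPy-only sketch. The arrays `labels_1d` and `logits` are hypothetical stand-ins for `train_y.values` and for `tf.matmul(X, W) + b`; they are not taken from the bank data. It shows the (n,) to (n, 1) reshape. It also suggests that, with a single output column, a softmax cross-entropy is always zero, so a sigmoid cross-entropy (the formula documented for `tf.nn.sigmoid_cross_entropy_with_logits`) is probably the better fit, and one-hot labels should not be needed.

```python
import numpy as np

# Hypothetical stand-in for train_y.values: five labels in a 1-D array.
labels_1d = np.array([0.0, 0.0, 1.0, 0.0, 1.0], dtype=np.float32)
print(labels_1d.shape)  # (5,)

# -1 lets NumPy infer the row count; the result has one column per row.
labels_2d = labels_1d.reshape(-1, 1)
print(labels_2d.shape)  # (5, 1)

# Hypothetical logits with the same (n, 1) shape as tf.matmul(X, W) + b.
logits = np.array([[-2.0], [0.5], [1.5], [-0.3], [3.0]], dtype=np.float32)

# Softmax across a single column is always 1, so the softmax
# cross-entropy is 0 for every row and gives no training signal.
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
softmax = exp / exp.sum(axis=1, keepdims=True)
softmax_loss = -(labels_2d * np.log(softmax)).sum(axis=1)

# Numerically stable sigmoid cross-entropy for logits x and labels z:
# max(x, 0) - x * z + log(1 + exp(-|x|))
sigmoid_loss = (np.maximum(logits, 0) - logits * labels_2d
                + np.log1p(np.exp(-np.abs(logits))))

print(softmax_loss)
print(sigmoid_loss.mean())
```

If that reading is right, swapping the cost in the question to `tf.nn.sigmoid_cross_entropy_with_logits(logits=prediction, labels=y)` would keep both placeholders at their current shapes. The snippet as posted also never defines `optimizer`, so it would need to be created (for example from a `tf.train` optimizer's `minimize(cost)`) before the training loop can run.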