How to feed multiple X data values per Y value to a TensorFlow NN

Time: 2018-02-06 12:44:47

Tags: python tensorflow

I have two sets of numpy array data, each of size 36*3. Each row of the two arrays corresponds to the same Y value. I need to feed each row of numpy array 1 into a regression NN to generate one value, then sum all of these outputs. This needs to be repeated for numpy array 2. The final prediction for a single Y value is the difference between the two summed predictions.

I'm not sure how to feed these rows of x data (3 input nodes) into the X placeholder and each y value into the Y placeholder, and then run this in a TF session, because the whole code below only corresponds to one valid data point; the function get_dataset(1) returns it. This is how a single data point is currently loaded.

I'd like to know how others handle this kind of problem. Basically I don't know how to format the x data when there are multiple inputs per Y value.

import numpy as np
import tensorflow as tf

data = get_dataset(1)  # 36*6 np array corresponding to one Y value

ideal_data = data[:,[0,1,4]] # ideal and displaced data are in these columns (0,1) and (2,3)
ideal_data = ideal_data.tolist() #flatten

displaced_data = data[:,[2,3,4]]
displaced_data = displaced_data.tolist()

y = data[0][5]

y_data = tf.convert_to_tensor(y)

for i in range(36): # get each row i.e. X(i) datapoints
  ideal_data_tf = tf.convert_to_tensor(ideal_data[i])
  displaced_data_tf = tf.convert_to_tensor(displaced_data[i])
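
One thing worth noting (a sketch only, with X and Y as assumed placeholder names): because a placeholder's first dimension can be left as None, the 36 rows of each array can be fed in one pass instead of being converted to tensors row by row:

X = tf.placeholder(tf.float32, [None, 3], name="X")   # any number of 3-feature rows
Y = tf.placeholder(tf.float32, shape=(), name="Y")    # the single scalar label

# feed all 36 rows belonging to one Y value at once
feed_ideal = {X: np.asarray(ideal_data, dtype=np.float32), Y: y}
feed_displaced = {X: np.asarray(displaced_data, dtype=np.float32), Y: y}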

My regression NN is currently defined as the following function, with an X placeholder:

with tf.name_scope("Training_Neural_Network"):
#Training Computation
  def training_multilayer_perceptron(X, weights, biases): #dropout should only be used during training, not during evaluation
    with tf.name_scope("Layer1"):
      layer_1 = tf.add(tf.matmul(X, weights['W1']), biases['b1'])
      layer_1 = tf.nn.relu(layer_1)
      layer_1 = tf.nn.dropout(layer_1,keep_prob)
    with tf.name_scope("Layer2"):
      layer_2 = tf.add(tf.matmul(layer_1, weights['W2']), biases['b2'])
      layer_2 = tf.nn.relu(layer_2)
      layer_2 = tf.nn.dropout(layer_2,keep_prob)
    with tf.name_scope("Layer3"):
      out_layer = tf.add(tf.matmul(layer_2, weights['W3']), biases['b3'])
      return out_layer
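
To get the single prediction described at the top (sum the 36 per-row outputs of each array, then take the difference), one possible sketch is to call this same function on both inputs so the weights are shared, and reduce over the row dimension. Here X_ideal, X_displaced and Y are assumed placeholders and weights/biases the variable dicts defined elsewhere:

with tf.name_scope("Prediction"):
  out_ideal = training_multilayer_perceptron(X_ideal, weights, biases)          # shape [36, 1]
  out_displaced = training_multilayer_perceptron(X_displaced, weights, biases)  # same dicts -> shared weights
  # sum the 36 per-row outputs of each array, then take the difference of the sums
  prediction = tf.reduce_sum(out_ideal) - tf.reduce_sum(out_displaced)
  loss = tf.square(prediction - Y)  # squared error against the single Y value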

P.S. This is my first question posted on SO, so any comments on how I could ask questions better would be much appreciated.

1 Answer:

Answer 0 (score: 0)

I may be misunderstanding the question, but I implemented something very similar in my master's thesis, so I'll show you the code below (with well-structured comments).

from pandas import read_excel
import numpy as np
import tensorflow as tf

df = read_excel('Train_Data.xlsx')

# convert dataframe into array
data = np.asarray(df, dtype=np.int64)

# train data (x) contains all rows and all columns except the last
x = data[:, :-1]
# label data (y) contains all rows and only the last column 
y = data[:, -1]

# label data is reshaped to fit the right format
y = np.reshape(y, [y.shape[0], 1])

# both datasets are shuffled to simplify split of train and test data
permutation = np.random.permutation(x.shape[0])
x = x[permutation]
y = y[permutation]

# test data ratio is determined
test_size = 0.1

# train data is sliced from list of total data, test data equals the rest
num_test = int(test_size * len(data))
X_train = x[:-num_test]
X_test = x[-num_test:]

# the same applies for the label values
Y_train = y[:-num_test]
Y_test = y[-num_test:]

[...]

# defining placeholder variables for model
x = tf.placeholder("float", [None, 7])
y = tf.placeholder("float", [None, 1])

The placeholder definition of x takes the shape of the input data, so if you have 36*6 = 216 features corresponding to 1 label, you would set

x = tf.placeholder("float", [None, 216])
y = tf.placeholder("float", [None, 1])
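
If you go this route, each 36*6 sample has to be flattened into a single row of 216 values before it is fed; a minimal sketch, assuming data is the array returned by get_dataset and samples is a hypothetical list of such arrays:

# flatten one 36x6 sample into a single row of 216 values
x_row = np.reshape(data, [1, 216])

# stack several flattened samples into a [num_samples, 216] training matrix
X_train = np.stack([np.reshape(s, 216) for s in samples])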

Afterwards you fill the placeholders inside the session (I implemented it with batches):

# launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(total_len/batch_size)
        # loop over all batches
        for i in range(total_batch-1):
            batch_x = X_train[i*batch_size:(i+1)*batch_size]
            batch_y = Y_train[i*batch_size:(i+1)*batch_size]
            # run optimization (backprop) and cost op (to get loss value)
            _, c, p = sess.run([optimizer, cost, pred], feed_dict={x: batch_x,
                                                                   y: batch_y})
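
After the training loop (still inside the with tf.Session() block), the held-out split can be checked through the same placeholders; a hedged example that assumes cost and pred are defined in the omitted [...] part and reuses X_test/Y_test from above:

    # evaluate cost and predictions on the test data once training is done
    test_cost, test_pred = sess.run([cost, pred], feed_dict={x: X_test, y: Y_test})
    print("test cost:", test_cost)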