Feeding a CSV file into TensorFlow to build a neural network

Time: 2017-06-08 03:57:52

Tags: python csv tensorflow

Here is the format of my CSV input file:

Feature1, Feature2, ... Feature5, Label
Feature1, Feature2, ... Feature5, Label

Following a tutorial on the Internet, I changed the input from the MNIST database to my CSV file, but I get an error message.

File "test5.py", line 95, in <module>
y: batch_y})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 767, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 938, in _run
np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
File "/usr/local/lib/python2.7/dist-packages/numpy/core/numeric.py", line 531, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

I would like to know what I am doing wrong.

Here is my code:

import tensorflow as tf

filename_queue = tf.train.string_input_producer(["file0.csv"])


line_reader = tf.TextLineReader(skip_header_lines=1)
_, csv_row = line_reader.read(filename_queue)


# Type information and column names based on the decoded CSV.
[[0.0],[0.0],[0.0],[0.0],[0.0],[""]]

record_defaults = [[0.0],[0.0],[0.0],[0.0],[0.0],[0]]
in1,in2,in3,in4,in5,out = tf.decode_csv(csv_row, record_defaults=record_defaults)

# Turn the features back into a tensor.
features = tf.stack([in1,in2,in3,in4,in5])

# Parameters
learning_rate = 0.001
training_epochs = 5
batch_size = 3
display_step = 1
num_examples= 15

# Network Parameters
n_hidden_1 = 6 # 1st layer number of features
n_hidden_2 = 6 # 2nd layer number of features
n_input = 5 # number of input features (Feature1..Feature5 in the CSV)
n_classes = 2 # number of label classes

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])


# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with RELU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Hidden layer with RELU activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer


# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Construct model
pred = multilayer_perceptron(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()


with tf.Session() as sess:
    #tf.initialize_all_variables().run()
    sess.run(init)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(num_examples/batch_size)
        # Loop over all batches

        for i in range(total_batch):
            batch_x = []
            batch_y = []
            for iteration in range(1, batch_size):
                example, label = sess.run([features, out])
                batch_x.append(example)
                #batch_y.append(label)
                onehot_labels = tf.one_hot(indices=tf.cast(label, tf.int32), depth=2)
                batch_y.append(onehot_labels)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                          y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print ("Epoch:", '%04d' % (epoch+1), "cost=", \
                "{:.9f}".format(avg_cost))
    print ("Optimization Finished!")
    coord.request_stop()
    coord.join(threads)

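1 answer:

Answer 0 (score: 0)

You cannot feed a tensor into a placeholder. The values in feed_dict must be plain Python or NumPy data, but tf.one_hot returns a tensor, so batch_y ends up as a list of tensors. You have two options: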

  1. Change the shape and dtype of the y placeholder so that it accepts the integer labels, then call tf.one_hot on that placeholder:

    y = tf.placeholder(tf.int32, [None])
    # depth 2 matches n_classes; pass y_one_hot (not y) as the labels of the loss
    y_one_hot = tf.one_hot(y, 2)
    ...
    # batch_labels is a list of integer labels
    _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_labels})
    
  2. Keep the y placeholder unchanged and build the one-hot values with a different type (a NumPy array in this example):

    import numpy as np

    # batch_labels is a list of integer labels; size the array from its length
    n = len(batch_labels)
    batch_labels_one_hot = np.zeros((n, 2), dtype=np.float32)
    batch_labels_one_hot[np.arange(n), batch_labels] = 1
    _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_labels_one_hot})
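
Option 2 can also be wrapped in a small helper so the conversion can be checked without running a session. The following is only a minimal sketch and not part of the original answer: the name to_one_hot and the sample labels are hypothetical, and it assumes the labels are integers in the range 0 to n_classes - 1.

```python
import numpy as np

NUM_CLASSES = 2  # assumed to match n_classes in the question


def to_one_hot(labels, num_classes=NUM_CLASSES):
    """Convert a sequence of integer class labels to a float32 one-hot array."""
    labels = np.asarray(labels, dtype=np.int64)
    one_hot = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set column labels[i] of row i to 1 using integer array indexing.
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot


# Hypothetical labels standing in for the values that
# sess.run([features, out]) returns in the question's inner loop.
batch_labels = [0, 1, 1]
batch_y = to_one_hot(batch_labels)
print(batch_y)
```

Because the result is an ordinary NumPy array of shape (number of labels, n_classes), it can be passed as the value for y in feed_dict, which a list of tensors cannot.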