TensorFlow cannot decode CSV

Time: 2018-01-31 09:13:18

Tags: python tensorflow machine-learning deep-learning keras

I took the following code from the book "TensorFlow 1.x Deep Learning Cookbook"; the dataset comes from http://lib.stat.cmu.edu/datasets/boston

import tensorflow as tf

# Global Parameters
DATA_FILE = 'boston_housing.csv' 
BATCH_SIZE = 10
NUM_FEATURES = 14

# Define a function that takes the file name and returns
# tensors in batches of size equal to batch_size
def data_generator(filename):
    '''
    Generates Tensors in batches of size BATCH_SIZE.
    Args: String Tensor
    Filename from which data is to be read
    Returns: Tensors
    feature_batch and label_batch
    '''
    # Define the filename queue "f_queue" and the line reader "reader"
    f_queue = tf.train.string_input_producer(filename)
    reader  = tf.TextLineReader(skip_header_lines=1)   # Skips the first line
    _, value = reader.read(f_queue)

    # Specify the default values to use for missing fields, decode the CSV record,
    # and select the features we need. Here we choose RM, PTRATIO, and LSTAT.
    record_defaults = [ [0.0] for _ in range(NUM_FEATURES)]
    data = tf.decode_csv(value, record_defaults = record_defaults)
    features = tf.stack(tf.gather_nd(data, [[5], [10], [12]]))
    label = data[-1]

    # Define parameters to generate batch and use tf.train.shuffle_batch() for randomly
    # shuffling tensors. The function returns the tensors-- feature_batch and label_batch

    # Minimum number of elements in the queue after a dequeue
    min_after_dequeue=10*BATCH_SIZE

    # Maximum number of elements in the queue
    capacity = 20 * BATCH_SIZE

    # Shuffle the data to generate BATCH_SIZE sample pairs
    feature_batch, label_batch = tf.train.shuffle_batch([features, label], 
                                                         batch_size = BATCH_SIZE,
                                                         capacity = capacity,
                                                         min_after_dequeue = min_after_dequeue)
    return feature_batch, label_batch


# Function that generates the batches in the session
def generate_data(feature_batch, label_batch):
    with tf.Session() as sess:
        # initialize the queue threads
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        for _ in range(5): # generate 5 batches
            features, labels = sess.run([feature_batch, label_batch])
            print(features, 'HI')
        coord.request_stop()
        coord.join(threads)

# Run
if __name__ == '__main__':
    feature_batch, label_batch = data_generator([DATA_FILE])
    generate_data(feature_batch, label_batch)

However, when I run this, I get the following errors:

  

INFO:tensorflow:Error reported to Coordinator: , Expect 14 fields but have 1 in record 0 [[Node: DecodeCSV_1 = DecodeCSV[OUT_TYPE=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], field_delim=",", na_value="", use_quote_delim=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ReaderReadV2_1:1, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0, DecodeCSV_1/record_defaults_0)]]

     

OutOfRangeError: RandomShuffleQueue '_41_shuffle_batch_4/random_shuffle_queue' is closed and has insufficient elements (requested 10, current size 0) [[Node: shuffle_batch_4 = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch_4/random_shuffle_queue, shuffle_batch_4/n)]]
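
For context on the first message: the log shows DecodeCSV running with field_delim set to a comma and 14 expected fields, and it reports finding only 1 field in record 0. A minimal sketch of the same kind of field count, using only Python's standard csv module rather than TensorFlow, is given here. The helper name count_fields and the two sample lines are hypothetical and are not taken from the actual data file; whether boston_housing.csv is really whitespace-separated is an assumption that would need checking against the file itself.

```python
import csv

NUM_FEATURES = 14  # same value as in the script above


def count_fields(line, delimiter=","):
    """Return how many fields a CSV parser finds in one line of text."""
    row = next(csv.reader([line], delimiter=delimiter))
    return len(row)


# Hypothetical sample lines (not real rows from the dataset).
comma_line = ",".join(["1.0"] * NUM_FEATURES)   # comma-separated values
space_line = " ".join(["1.0"] * NUM_FEATURES)   # whitespace-separated values

print(count_fields(comma_line))    # 14 fields when commas separate the values
print(count_fields(space_line))    # 1 field: no comma, so the whole line is one field
print(len(space_line.split()))     # 14 values when split on whitespace instead
```

Running count_fields on the first few lines of the file being read would show whether each line really contains 14 comma-separated values, which is what the decoder in the script expects.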

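I am very new to TensorFlow, and the book barely explains what is going on. This is supposed to work with Python 3.5 and TensorFlow 1.3. Could you at least point me in the right direction?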

0 answers:

No answers