Question

I want to read a large dataset with rows such as: AAB,20170525,0.13,0.14,0.13,0.14,2060

import tensorflow as tf

filename_queue = tf.train.string_input_producer(["D:/data/20170623.csv"])

reader = tf.TextLineReader(skip_header_lines=1)
key, value = reader.read(filename_queue)

record_defaults = [tf.constant([], dtype= tf.int32),
                   tf.constant([], dtype= tf.int32),
                   tf.constant([], dtype=tf.int32),
                   tf.constant([], dtype=tf.int32),
                   tf.constant([], dtype=tf.int32),
                   tf.constant([], dtype=tf.int32),
                   tf.constant([], dtype=tf.int32)]

col1, col2, col3, col4, col5, col6, col7 = tf.decode_csv(value, record_defaults=record_defaults)
assert col1.dtype == tf.int32  
assert col2.dtype == tf.int32  
assert col3.dtype == tf.int32    
assert col4.dtype == tf.int32    
assert col5.dtype == tf.int32    
assert col6.dtype == tf.int32  
assert col7.dtype == tf.int32    

features = tf.stack([tf.to_string(col1), tf.to_string(col2)])

features = tf.stack([col1, col2, col3, col4, col5, col6, col7])

with tf.Session() as sess:
  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

  for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col1])

  coord.request_stop()
  coord.join(threads)

Error:

Traceback (most recent call last):

  File "<ipython-input-18-8079bf3fc932>", line 1, in <module>
    runfile('D:/data/temp.py', wdir='D:/data')

  File "D:\Anaconda\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
    execfile(filename, namespace)

  File "D:\Anaconda\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "D:/data/temp.py", line 33, in <module>
    features = tf.stack([tf.to_string(col1), tf.to_string(col2)])

AttributeError: module 'tensorflow' has no attribute 'to_string'

How do I return values from the CSV file when it contains dates, words, integer values, and floats?

Thanks for your help.

Answer 1

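You should specify the correct record_defaults in the tf.decode_csv step by changing the dtype of each default tensor, so that TensorFlow knows which dtype to produce for each column. That way you don't need to convert anything afterwards.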

For example, you could do this:

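record_defaults = [[''],    # col1: string, e.g. AAB
                   [''],    # col2: date kept as a string, e.g. 20170525
                   [0.0],   # col3: float
                   [0.0],   # col4: float
                   [0.0],   # col5: float
                   [0.0],   # col6: float (0.14 in the sample row)
                   tf.constant([], dtype=tf.int32)]  # col7: required int32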
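
To make the idea concrete without depending on a TensorFlow version, here is a minimal plain-Python sketch of the same principle: one converter per column, chosen to match the sample row from the question. The names COLUMN_TYPES and parse_line are hypothetical and only illustrate what record_defaults expresses; this is not how tf.decode_csv is implemented.

```python
import csv
import io

# Hypothetical per-column converters, playing the role of record_defaults:
# two string columns, four float columns, one integer column.
COLUMN_TYPES = [str, str, float, float, float, float, int]


def parse_line(line):
    """Split one CSV line and convert each field with its column's type."""
    fields = next(csv.reader(io.StringIO(line)))
    if len(fields) != len(COLUMN_TYPES):
        raise ValueError(
            "expected %d fields, got %d" % (len(COLUMN_TYPES), len(fields))
        )
    return [convert(field) for convert, field in zip(COLUMN_TYPES, fields)]


# The sample row from the question.
row = parse_line("AAB,20170525,0.13,0.14,0.13,0.14,2060")
print(row)
```

If every converter were int, as in the question's all-int32 record_defaults, the very first field ("AAB") would fail to convert, which is why the column types have to match the data.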

Answer 2

One solution is to import pandas.

If you are using Anaconda:

conda install -c anaconda pandas=0.20.2

Wait until pandas has finished installing, or search for pandas under Anaconda > tensorflow.

The code is:

import pandas as pd
df=pd.read_csv("csvdirectoryhere")
print(df)

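If you want to import several datasets into Python, use a different variable name for each one. For example, change df to df1: df1 = pd.read_csv("csvdirectoryhere"), then print(df1), and so on.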

To use the imported data, you can do something like this, or follow the docs.
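
As a minimal sketch of giving each column a type with pandas: the column names below (symbol, date, open, high, low, close, volume) are assumptions, since the question does not say what the columns mean, and an in-memory string stands in for the real CSV file.

```python
import io

import pandas as pd

# Stand-in for the CSV file; the single row is the sample from the question.
csv_text = "AAB,20170525,0.13,0.14,0.13,0.14,2060\n"

# Hypothetical column names; the question does not say what the columns mean.
names = ["symbol", "date", "open", "high", "low", "close", "volume"]

df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,  # the stand-in text has no header row
    names=names,
    dtype={"symbol": str, "date": str, "volume": "int64"},
)

# Convert the YYYYMMDD strings into real timestamps.
df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")

print(df.dtypes)
print(df.loc[0, "close"])
```

For a real file that starts with a header row, passing header=0 together with names replaces the file's own header with the given names.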

Special thanks to everyone who answered my question.

Module 'tensorflow' has no attribute 'to_string'
