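Problem with pandas.read_csv

Time: 2018-12-28 11:08:26

Tags: python pandas

I am trying to read messages from a CSV dataset, but the values under the class label do not come through as they appear in the file.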

  

messages = pandas.read_csv('bitcoin_reddit.csv', delimiter='\t', names=["title", "class"])
print(messages)

Under the class label, pandas reads only NaN.

The contents of my CSV file:

title,url,timestamp,class
"It's official! 1 Bitcoin = $10,000 USD",https://v.redd.it/e7io27rdgt001,29/11/2017 17:25,0
The last 3 months in 47 seconds.,https://v.redd.it/typ8fdslz3e01,4/2/2018 18:42,0
It's over 9000!!!,https://i.imgur.com/jyoZGyW.gifv,26/11/2017 20:55,1
Everyone who's trading BTC right now,http://cdn.mutually.com/wp-content/uploads/2017/06/08-19.jpg,7/1/2018 12:38,1
I hope James is doing well,https://i.redd.it/h4ngqma643101.jpg,1/12/2017 1:50,1
Weeeeeeee!,https://i.redd.it/iwl7vz69cea01.gif,17/1/2018 1:13,0
Bitcoin.. The King,https://i.redd.it/4tl0oustqed01.jpg,1/2/2018 5:46,1
Nothing can increase by that much and still be a good investment.,https://i.imgur.com/oWePY7q.jpg,14/12/2017 0:02,1
"This is why I want bitcoin to hit $10,000",https://i.redd.it/fhzsxgcv9nyz.jpg,18/11/2017 18:25,1
Bitcoin Doesn't Give a Fuck.,https://v.redd.it/ty2y74gawug01,18/2/2018 15:19,-1
Working Hard or Hardly Working?,https://i.redd.it/c2o6204tvc301.jpg,12/12/2017 12:49,1

1 answer:

Answer 0 (score: 1)

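The delimiter in your CSV file is a comma, not a tab. Because the lines contain no tabs, each one is read as a single field that ends up in title, which leaves class as NaN. And since the comma is the default delimiter, there is no need to specify it.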
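The effect can be reproduced on a small excerpt of the data. The sketch below is only an illustration: it assumes the first three rows of the file are held in an in-memory string (csv_text and the variable names are made up for this example) instead of reading bitcoin_reddit.csv from disk.

```python
import io

import pandas as pd

# Assumption: a three-row excerpt of the CSV from the question, kept in memory.
csv_text = """title,url,timestamp,class
"It's official! 1 Bitcoin = $10,000 USD",https://v.redd.it/e7io27rdgt001,29/11/2017 17:25,0
The last 3 months in 47 seconds.,https://v.redd.it/typ8fdslz3e01,4/2/2018 18:42,0
It's over 9000!!!,https://i.imgur.com/jyoZGyW.gifv,26/11/2017 20:55,1
"""

# The call from the question: tab delimiter plus custom names.
# No line contains a tab, so every line is a single field stored in "title".
wrong = pd.read_csv(io.StringIO(csv_text), delimiter="\t", names=["title", "class"])
print(wrong["class"].isna().all())  # every value in "class" is missing

# Default comma delimiter: the header row supplies the four column names.
right = pd.read_csv(io.StringIO(csv_text))
print(list(right.columns))
```

Note that passing names= without a header argument also makes pandas treat the header line as a data row, which is why wrong has one row more than right.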

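Also, names= defines custom names for the columns. Your header row already provides the names, so you only need to pass the names of the columns you are interested in to usecols: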

>>> pd.read_csv(file, usecols=['title', 'class'])
                                               title  class
0             It's official! 1 Bitcoin = $10,000 USD      0
1                   The last 3 months in 47 seconds.      0
2                                  It's over 9000!!!      1
3               Everyone who's trading BTC right now      1
4                         I hope James is doing well      1
5                                         Weeeeeeee!      0
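
For a version that runs without the original file, here is a minimal sketch under the same assumption of an in-memory excerpt (csv_text is a stand-in for bitcoin_reddit.csv). It also shows one way to rename columns if that is what names= was meant to do: combine it with header=0 so the existing header row is replaced and not read as data. The replacement names headline, link, posted and label are hypothetical.

```python
import io

import pandas as pd

# Assumption: a three-row excerpt of the CSV from the question, kept in memory.
csv_text = """title,url,timestamp,class
"It's official! 1 Bitcoin = $10,000 USD",https://v.redd.it/e7io27rdgt001,29/11/2017 17:25,0
The last 3 months in 47 seconds.,https://v.redd.it/typ8fdslz3e01,4/2/2018 18:42,0
It's over 9000!!!,https://i.imgur.com/jyoZGyW.gifv,26/11/2017 20:55,1
"""

# Keep only the two columns of interest; their names come from the header row.
df = pd.read_csv(io.StringIO(csv_text), usecols=["title", "class"])
print(df)

# Hypothetical renaming: header=0 tells pandas to discard the existing header
# row and use the names given here instead.
renamed = pd.read_csv(
    io.StringIO(csv_text),
    header=0,
    names=["headline", "link", "posted", "label"],
)
print(list(renamed.columns))
```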