Incorrect number of items in a list (Python)

Time: 2018-02-25 17:59:47

Tags: python list

I have a tab-separated text file of tweets that looks like this:

1   1   Sweet United Nations video. Just in time for Christmas. #imagine #NoReligion 
2   1   @mrdahl87 We are rumored to have talked to Erv..that's hardly nothing    ;)
3   1   Hey there! Nice to see you Winter Weather 
4   0   3 episodes left I'm dying over here

I have this code:

import csv

with open('./data/train.txt',encoding="utf8") as inf:
    reader = csv.reader(inf, delimiter='\t')
    col1 = list(zip(*reader))[0]

c = 0
for x in col1:
    c = c+1
    print(x , "  " , c)

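When I print the length of my list it shows 3817, but the actual number of items is 3834! I added a counter "c" to check the counting, and it also gives 3817.

I checked the file manually by printing the lines: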

file_lines  counter_c
1643        1643
1644        1644
1645        1645
1649        1646 <-----
1650        1647
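
The manual comparison above can also be automated. A minimal sketch, assuming the file is tab-separated as shown: csv.reader exposes a line_num attribute (the number of physical lines read so far), so any record that consumed more than one line can be reported. The sample text and the helper name multiline_records are hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical sample: the text field of record 2 opens a double quote
# that is not closed on its own line.
SAMPLE = (
    '1\t1\tplain tweet\n'
    '2\t0\t"unterminated quote \n'
    '3\t0\tanother tweet\n'
    '4\t1\t"closing quote arrives here"\n'
    '5\t1\tlast tweet\n'
)

def multiline_records(lines):
    """Yield (start_line, end_line, first_field) for every record that
    spans more than one physical line of the input."""
    reader = csv.reader(lines, delimiter='\t')
    previous_end = 0
    for row in reader:
        start, end = previous_end + 1, reader.line_num
        if end > start:
            yield (start, end, row[0] if row else '')
        previous_end = end

suspects = list(multiline_records(io.StringIO(SAMPLE)))
print(suspects)
```

For the real file, the same function could be called with the open file object instead of the io.StringIO wrapper.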

I found that the file reader skipped some lines, namely 1646, 1647 and 1648.

These are the lines in question:

1645    0   "@SchmidtSTL: Thanks to @automaticg. I think I backed into the playoffs! Playoff matchup: TacoCorp v Gronkey Punch 
1646    0   oh yeah,  its official #im  #crazy htt.co/bgcLDJQIR6
1647    1   Oh well, looks like we are back to square 1. Batie and Bridge. This is going to go so well    #BoldandBeautiful
1648    1   "@antoineraps: @edifyin how are you up now? Ho!"

What is the problem?

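Edit (added tweet 1645)

I found that tweet 1645 is the one causing the problem. What is wrong with it? Or how can I work around it when reading the text?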
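
A likely explanation, assuming the file looks like the excerpt above: the text field of tweet 1645 starts with a double quote that is not closed on the same line. By default csv.reader treats " as the quote character, so it keeps reading across line breaks until it meets the next quote, which appears at the start of the text of tweet 1648. Lines 1645 to 1648 then come back as a single row. Passing quoting=csv.QUOTE_NONE tells the reader to treat quotes as ordinary characters. The sketch below uses shortened, hypothetical sample text rather than the real file, and first_column is an illustrative helper name.

```python
import csv
import io

# Hypothetical, shortened stand-in for lines 1645-1648 of the file.
SAMPLE = (
    '1645\t0\t"@SchmidtSTL: Thanks to @automaticg. Playoff matchup \n'
    '1646\t0\toh yeah,  its official #im  #crazy\n'
    '1647\t1\tOh well, looks like we are back to square 1.\n'
    '1648\t1\t"@antoineraps: @edifyin how are you up now? Ho!"\n'
)

def first_column(text, **reader_kwargs):
    """Return the first field of every non-empty tab-separated record."""
    reader = csv.reader(io.StringIO(text), delimiter='\t', **reader_kwargs)
    return [row[0] for row in reader if row]

# Default settings: the quote opened in 1645 swallows the following lines.
default_ids = first_column(SAMPLE)

# QUOTE_NONE: quotes are ordinary characters, so each line is one record.
fixed_ids = first_column(SAMPLE, quoting=csv.QUOTE_NONE)

print(len(default_ids), default_ids)
print(len(fixed_ids), fixed_ids)
```

On this sample the default reader returns one record and the QUOTE_NONE reader returns four. In the original script the equivalent change would be csv.reader(inf, delimiter='\t', quoting=csv.QUOTE_NONE), provided no field in the file relies on quoting to contain a tab.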

0 answers:

No answers