Python csv reader stops iterating partway through the file

Time: 2017-02-03 15:37:06

Tags: python python-2.7 csv parsing

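Here is my problem. I need to parse a comma-delimited file. My code works fine, but while testing it and trying to break it I ran into an issue.

Here is the sample code: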

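import csv

# Field names every output row must provide, in order.
compareList=["testfield1","testfield2","testfield3","testfield4"]
z=open("testFile",'r')
# Output file. Its name is not shown in the original post, so "testFileOut" is an assumption.
testFile=open("testFileOut",'w')
x=csv.reader(z,quotechar='\'')
testDic={}
iter=0
for lineList in x:
    try:
        for item in compareList:
            testDic[item]=lineList[iter]
            iter+=1
        iter=0
    except IndexError:
        # The row is too short: replace it with one empty value per expected field.
        iter=0
        lineList=[]
        for item in compareList:
            lineList.append("")
            testDic[item]=lineList[iter]
            iter+=1
        iter=0

    # Write the fields back out, comma-separated, one row per line.
    for item in compareList:
        testFile.write(testDic[item])
        if compareList.index(item)!=len(compareList)-1:
            testFile.write(",")
    testFile.write('\n')
testFile.close()
z.close()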

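What this is supposed to do is check that every line of the csv file has as many fields as compareList. If a line's length does not match, the line is replaced with empty values (just commas), one for each entry in compareList. Here is an example of what the file contains: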

,,"sometext",343434
,,"moretext",343434
,,"stuff",4543343
,,"morestuff",3434354

If a line is missing one item, the code works fine. So the output for a file that contains:

,"sometext",343434
,,"moretext",343434
,,"stuff",4543343
,,"morestuff",3434354

will look like this:

,,,,
,,"moretext",343434
,,"stuff",4543343
,,"morestuff",3434354
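The intended behaviour described above can be sketched more compactly. This is only a minimal illustration, written for Python 3 rather than the python-2.7 of the question. The names `normalise_rows` and `EXPECTED_FIELDS` are hypothetical and do not appear in the original code. Unlike the original loop, which only reacts to rows that are too short, this sketch also blanks rows that are too long.

```python
import csv
import io

EXPECTED_FIELDS = 4  # hypothetical constant; mirrors len(compareList) in the question


def normalise_rows(source, expected=EXPECTED_FIELDS, quotechar="'"):
    """Yield each parsed row, or a row of empty strings if its length is wrong."""
    for row in csv.reader(source, quotechar=quotechar):
        if len(row) == expected:
            yield row
        else:
            # Wrong number of fields: emit one empty value per expected field.
            yield [""] * expected


# In-memory stand-in for the input file, using two lines from the question.
sample = io.StringIO(
    ',"sometext",343434\n'
    ',,"moretext",343434\n'
)
rows = list(normalise_rows(sample))
print(rows)
```

The first sample line has only three fields, so it comes back as four empty strings, while the second line is passed through unchanged (its double quotes are kept because the quote character here is the single quote).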

The problem I ran into occurs when the file looks like this (note the third line):

,"sometext",343434
,,"moretext",343434
,,"St,'",uff",4543343
,,"morestuff",3434354

The output for this file is:

,,,,
,,"moretext",343434
,,,,

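So it applies the changes as expected and blanks out lines 1 and 3, but then it simply stops processing at that line. I have been trying to figure out what is going on here, with no luck.

As always, I greatly appreciate any help you are willing to give.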

1 Answer:

Answer 0: (score: 1)

Just print each row returned by csv.reader to see what the problem is:

>>> import csv
>>> z=open("testFile",'r')
>>> x=csv.reader(z,quotechar='\'')
>>> for lineList in x:
...     print lineList
...
['', '"sometext"', '343434']
['', '', '"moretext"', '343434']
['', '', '"St', '",uff",4543343\n,,"morestuff",3434354\n']

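The last two lines of the file come back from csv.reader as a single row. With quotechar='\'', the single quote in the third line opens a quoted field that is never closed, so the reader swallows everything up to the end of the file, newlines included. Now just remove quotechar='\'' and each line is parsed as its own row:
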
>>> import csv
>>> z=open("testFile",'r')
>>> x=csv.reader(z)
>>> for lineList in x:
...     print lineList
...
['', 'sometext', '343434']
['', '', 'moretext', '343434']
['', '', "St,'", 'uff"', '4543343']
['', '', 'morestuff', '3434354']
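
The comparison above can be reproduced without a file on disk. The following is a small sketch for Python 3 (the session above is Python 2) that feeds the same four lines through `io.StringIO`. The helper `parse` and the constant `DATA` are hypothetical names introduced only for this illustration.

```python
import csv
import io

# The four lines from the question, including the line with the stray single quote.
DATA = (
    ',"sometext",343434\n'
    ',,"moretext",343434\n'
    ',,"St,\'",uff",4543343\n'
    ',,"morestuff",3434354\n'
)


def parse(text, **reader_kwargs):
    """Parse CSV text held in memory and return all rows as a list."""
    return list(csv.reader(io.StringIO(text), **reader_kwargs))


single = parse(DATA, quotechar="'")  # single quote as quote character
default = parse(DATA)                # default quote character: double quote

print(len(single))   # 3: the last two lines are merged into one row
print(len(default))  # 4: one row per line
print(default[2])
```

With the single quote as the quote character, the final field of the third row contains the rest of the file, which is why the loop in the question appears to stop early.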