Python:如何解析包含NULL值的CSV文件?

时间:2017-04-06 19:57:12

标签: python csv null

我有一个包含二进制字段的csv文件,当我按csv.reader(f)阅读时,我得到了

  

包含NULL值。

我已在网络上尝试了各种解决方案,例如thisthisthis,但仍然会弹出相同的错误。我设法逐行阅读并将其分开,,但有些字段中还有,,所以我想知道如何读取和提取列?行的示例如下:

212344408,"cp233.net","net","cp233","clientTransferProhibited,ClientDeleteProhibited","ENAME TECHNOLOGY CO., LTD.",1331,"DNS1.IIDNS.COM","DNS2.IIDNS.COM","2017-02-14","2018-02-14","2017-02-14","WANG MIN CHUN","wangminchun","WANG MIN CHUN","wangminchun","957596578@QQ.COM","QUANZHOUSHIANXIXIANCHANGKENGXIANGHUAMEICUN","QUAN ZHOU HI","FU,JIAN","362421","CN","+86.59523128184","+86.59523128184","%^^<AD>!^S\0<A8>E<98><AC>/^<A5><A0><C9>7","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM",0,"2017-03-14 21:33:15","2017-03-12 20:44:02",0,"whois_zone_snr","2017-03-14 21:33:15",\N

我很感激任何建议。

2 个答案:

答案 0 :(得分:2)

Pandas对我的案例很有用,可以检索文件并跳过那些由于奇怪字符而被破坏的行。

import pandas as pd

df = pandas.read_csv(filename, verbose =True , warn_bad_lines = True, error_bad_lines=False, names = header)

答案 1 :(得分:0)

这对你的例子很好用,我甚至用NULL替换了一个字符串,它处理得很好。

test.csv:

212344408,"cp233.net","net","cp233","clientTransferProhibited,ClientDeleteProhibited","ENAME TECHNOLOGY CO., LTD.",1331,"DNS1.IIDNS.COM","DNS2.IIDNS.COM","2017-02-14","2018-02-14","2017-02-14","WANG MIN CHUN","wangminchun","WANG MIN CHUN","wangminchun","957596578@QQ.COM","QUANZHOUSHIANXIXIANCHANGKENGXIANGHUAMEICUN","QUAN ZHOU HI","FU,JIAN","362421","CN","+86.59523128184","+86.59523128184","%^^<AD>!^S\0<A8>E<98><AC>/^<A5><A0><C9>7","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM",0,"2017-03-14 21:33:15","2017-03-12 20:44:02",0,"whois_zone_snr","2017-03-14 21:33:15",\N
212344408,NULL,"net","cp233","clientTransferProhibited,ClientDeleteProhibited","ENAME TECHNOLOGY CO., LTD.",1331,"DNS1.IIDNS.COM","DNS2.IIDNS.COM","2017-02-14","2018-02-14","2017-02-14","WANG MIN CHUN","wangminchun","WANG MIN CHUN","wangminchun","957596578@QQ.COM","QUANZHOUSHIANXIXIANCHANGKENGXIANGHUAMEICUN","QUAN ZHOU HI","FU,JIAN","362421","CN","+86.59523128184","+86.59523128184","%^^<AD>!^S\0<A8>E<98><AC>/^<A5><A0><C9>7","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM",0,"2017-03-14 21:33:15","2017-03-12 20:44:02",0,"whois_zone_snr","2017-03-14 21:33:15",\N

代码:

import csv
with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

如果这不是您遇到的行为,您可以提供失败的行吗?