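How do I remove empty double quotes from a CSV file in Python?

Time: 2019-03-29 04:02:30

Tags: python

How can I remove empty double quotes from a CSV file using Python?

This is what the file currently looks like: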

"text","more text","","other text","","text"

This is what I want it to look like:

"text","more text",,"other text",,"text"

3 answers:

Answer 0: (score: 0)

I think the best solution is to use the quotechar option of csv.reader and then filter out the empty fields:

import csv

with open('test.csv', newline='') as csvf:
    for row in csv.reader(csvf, delimiter=',', quotechar='"'):
        row = filter(lambda v: v, row)
        # Now row is just an iterator containing non-empty strings
        # You can use it as you please, for example: 
        print(', '.join(row))

If, instead of removing the empty fields, you need to replace them with a given value (for example None):

import csv

def read(file, placeholder=None):
    with open(file, newline='') as csvf:
        for row in csv.reader(csvf, delimiter=',', quotechar='"'):
            yield [v if v else placeholder for v in row]

for row in read('test.csv'):
    pass # Do something with row

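For example, if you need to print the rows to standard output with double quotes around the non-empty fields (this is a silly example):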

for row in read('test.csv'):
    print(', '.join(f'"{v}"' if v else '' for v in row))
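To get exactly the layout asked for in the question, the same idea can be used to write the rows back out. This is a minimal sketch, not part of the original answer: the helper names write_row and convert are made up for illustration, and it assumes every non-empty field should be quoted, with embedded quotes doubled in the usual CSV way.

```python
import csv
import io

def write_row(out, row):
    # Quote each non-empty field (doubling any embedded quotes);
    # write empty fields as nothing at all, so they appear as ",,".
    fields = ('"' + v.replace('"', '""') + '"' if v else '' for v in row)
    out.write(','.join(fields) + '\n')

def convert(src, dst):
    # src and dst are open text streams (hypothetical names);
    # csv.reader takes care of parsing the quoted input correctly.
    for row in csv.reader(src, delimiter=',', quotechar='"'):
        write_row(dst, row)

# Small in-memory demonstration using the line from the question
src = io.StringIO('"text","more text","","other text","","text"\n')
dst = io.StringIO()
convert(src, dst)
print(dst.getvalue(), end='')
```

With real files, open both with newline='' as in the snippets above and pass the file objects to convert.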

Answer 1: (score: 0)

You can try:

>>> s=""""text","more text","","other text","","text" """
>>> s
'"text","more text","","other text","","text" '
>>> s.replace('""','')
'"text","more text",,"other text",,"text" '
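One caveat with a plain replace: CSV escapes a quote inside a field by doubling it, so replacing every '""' also mangles fields such as "he said ""hi""". A slightly more careful sketch, assuming comma-delimited input, is to remove '""' only when it makes up a whole field. The names EMPTY_FIELD and strip_empty_quotes are hypothetical, and this is still a heuristic: unusual field contents can fool it, and a real parser such as the csv module is more robust.

```python
import re

# Match "" only when it stands alone as a field: preceded by the start of
# the text, a comma or a newline, and followed by a comma, a line ending
# or the end of the text.
EMPTY_FIELD = re.compile(r'(?<![^,\n])""(?![^,\r\n])')

def strip_empty_quotes(text):
    # Drop the quotes of empty fields, leave everything else untouched
    return EMPTY_FIELD.sub('', text)

print(strip_empty_quotes('"text","more text","","other text","","text"'))
# Doubled quotes inside a field are left alone
print(strip_empty_quotes('"a","he said ""hi"""'))
```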

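Answer 2: (score: 0)

A combination of a lambda function and some pandas magic will speed this up considerably. Once you have loaded the DataFrame, you will have something like this:

[image: Before Processing]

Then you just need to write a lambda function: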

replacer = lambda x: x.replace('""','')
df = df.apply(replacer)

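This performs the operation you are looking for and gives you:

[image: After Applying replacer]

Then just use df.to_csv(filepathAsStr) to save the changes to disk, or carry on with whatever processing you need. df.apply() hands each column to the function as a whole rather than one value at a time, so on large files it can be faster than a plain str.replace loop or any other method that processes the values one by one in Python.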