Question

Consider the following CSV:

"""tom"""
""fred""
"henry"
Jack
"""mary"""

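The code below looks for some characters I define, deletes them, and then adds a string to the end of each line (row). It "works", but I'm not sure I'm going about it the right way. It seems to me that the original file should be opened, edited, and saved. I will be running this against thousands of CSV files, so it could get pretty messy.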

import csv
s = open('Book1.csv','r').read()
chars = ('$','%','^','*','"','_') # etc
for c in chars:
  s = ''.join( s.split(c) )
out_file = open('Book2.csv','w')
out_file.write(s)
out_file.close()
output = ""
file_name = 'Book2.csv'
string_to_add = "@bigfoot.com"
with open(file_name, 'r') as f:
    file_lines = [''.join([x.strip(), string_to_add, '\n']) for x in f.readlines()]
with open(file_name, 'w') as f:
    f.writelines(file_lines)

Output:


tom@bigfoot.com
fred@bigfoot.com
henry@bigfoot.com
Jack@bigfoot.com
mary@bigfoot.com

Answer 1

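You only need to open the file once for reading and once for writing, and you don't need two separate files. The fewer file reads and writes you perform, the faster your script will run.

A few side points:

Always use with open(...) as f so the file is closed for you.
A more readable way to remove characters is str.replace().
You may want to look at str.splitlines().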
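
To illustrate the last two points, here is a small sketch run on the sample data from the question. The data is held in a string rather than read from a file, and the variable names are only illustrative.

```python
# Sample input taken from the question, held in a string instead of a file.
raw = '"""tom"""\n""fred""\n"henry"\nJack\n"""mary"""\n'

# str.replace() removes every occurrence of a character, wherever it appears.
cleaned = raw.replace('"', '')

# str.splitlines() splits on line boundaries and drops the line endings.
names = cleaned.splitlines()
print(names)  # ['tom', 'fred', 'henry', 'Jack', 'mary']
```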

Also, judging from this example, you are not actually using the csv module anywhere in your code.

Here is what I would suggest:

chars = ('$', '%', '^', '*', '"', '_')
string_to_add = '@bigfoot.com'

with open('tmp', 'r') as f:
    s = f.read()

# Replace unwanted characters
for c in chars:
    s = s.replace(c, '')

# Append string_to_add to the end of every line
s = '\n'.join(line + string_to_add for line in s.splitlines())

with open('tmp', 'w') as f:
    f.write(s)
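
Since the question mentions running this against thousands of files, the same read-once, write-once idea could be wrapped in a function and applied per file. The following is only a sketch under some assumptions: the helper names (clean_text, clean_file, clean_folder) are made up for illustration, the files are assumed to sit directly in one folder with a .csv extension, and str.translate is swapped in for the replace loop as one possible way to delete several characters in a single pass.

```python
from pathlib import Path

CHARS = '$%^*"_'            # characters to delete, as in the question
SUFFIX = '@bigfoot.com'     # string appended to every line

# Translation table that maps each unwanted character to None (deletes it).
DELETE_TABLE = str.maketrans('', '', CHARS)


def clean_text(text):
    """Delete unwanted characters and append SUFFIX to each line."""
    lines = text.translate(DELETE_TABLE).splitlines()
    return ''.join(line.strip() + SUFFIX + '\n' for line in lines)


def clean_file(path):
    """Rewrite one file in place: one read, one write."""
    path = Path(path)
    path.write_text(clean_text(path.read_text()))


def clean_folder(folder):
    """Apply clean_file to every .csv file directly inside folder (assumed layout)."""
    for csv_path in sorted(Path(folder).glob('*.csv')):
        clean_file(csv_path)
```

Because each file is overwritten in place, it is worth trying this on a copy of the data first. Note also that an empty line in the input would come out as a line containing only the suffix.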

Answer 2

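You're overcomplicating it.

First, read the lines and apply strip to each one to remove all of the listed characters (including the linefeed, or it won't work) from the start and end of the string. The loop using replace is inefficient and unnecessary, since strip does exactly what you want.

Then write the lines back to the same file, appending the domain and a linefeed.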

input_file = 'Book1.csv'
chars = '$%^*"_\n'  # etc notice the \n (linefeed)
with open(input_file) as f:
    lines = [x.strip(chars) for x in f]
with open(input_file,"w") as f:
    f.writelines("{}@bigfoot.com\n".format(x) for x in lines)
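
One caveat worth checking against your real data: strip only removes characters from the two ends of a string, while replace removes them everywhere. For the sample in the question both give the same result, but they differ as soon as an unwanted character appears in the middle of a line. A small sketch of the difference (the second input value is made up for illustration):

```python
chars = '$%^*"_\n'

# Sample line from the question: unwanted characters only at the ends.
edge_only = '"""tom"""\n'
print(edge_only.strip(chars))              # tom

# Hypothetical line with an unwanted character in the middle.
embedded = '"to_m"\n'
print(embedded.strip(chars))               # to_m (the inner underscore survives)

# Deleting every occurrence instead, using a translation table.
delete_table = str.maketrans('', '', chars)
print(embedded.translate(delete_table))    # tom
```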
