I'm trying to replace certain characters in a CSV file with other characters.
My current code looks something like this:
    import csv

    set1 = set('abc')
    set2 = set('def')
    set3 = set('ghi')

    # path and path2 are the input and output file paths
    with open(path, 'r', newline='') as infile, open(path2, 'w', newline='') as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        for row in reader:
            newrow = row
            # one full pass over every item for each set of characters
            newrow = [''.join('x' if c in set1 else c for c in item) for item in newrow]
            newrow = [''.join('y' if c in set2 else c for c in item) for item in newrow]
            newrow = [''.join('z' if c in set3 else c for c in item) for item in newrow]
            writer.writerow(newrow)
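In this example I only use three generator expressions, but there could easily be many more.
Does anyone know the right way to do this? My concern is that this structure may not be the fastest (and it certainly doesn't look optimal).
Answer 0 (score: 2)
You can use a loop and parameterize the parts that differ: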
    newrow = row
    for v, s in (('x', set1), ('y', set2), ('z', set3)):
        newrow = [''.join(v if c in s else c for c in item) for item in newrow]
If you are replacing characters, though, don't use sets; use a mapping instead:
    mapping = dict.fromkeys(set1, 'x')
    mapping.update(dict.fromkeys(set2, 'y'))
    mapping.update(dict.fromkeys(set3, 'z'))
    for row in reader:
        # look each character up once; characters not in the mapping pass through
        newrow = [''.join(mapping.get(c, c) for c in item) for item in row]
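As a small self-contained check of the mapping approach, the sketch below applies it to an in-memory row rather than a file. The helper name `replace_chars` and the sample row are made up for illustration; the sets are the ones from the question.

```python
set1 = set('abc')
set2 = set('def')
set3 = set('ghi')

# One dict lookup per character instead of one pass per set.
mapping = dict.fromkeys(set1, 'x')
mapping.update(dict.fromkeys(set2, 'y'))
mapping.update(dict.fromkeys(set3, 'z'))


def replace_chars(item, mapping):
    """Replace every character of item that has an entry in mapping (hypothetical helper)."""
    return ''.join(mapping.get(c, c) for c in item)


row = ['badge', 'high', 'xyz', '123']  # sample row, not from the original post
newrow = [replace_chars(item, mapping) for item in row]
print(newrow)  # ['xxyzy', 'zzzz', 'xyz', '123']
```

Note that a single mapping replaces each character exactly once. With chained passes, a character produced by an earlier pass would be replaced again if it also appeared in a later set; with the disjoint sets used here, both approaches give the same result.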
Answer 1 (score: 2)
str.translate may be a good fit here:
    replacements = [
        ('abc', 'x'),
        ('def', 'y'),
        ('ghi', 'z'),
    ]
    # build a translation table mapping each source character to its replacement
    trans = str.maketrans({k: v for chars, v in replacements for k in chars})
and then, for each row:
    new_row = [item.translate(trans) for item in row]
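To show how the translation table fits into the read/write loop from the question, here is a minimal end-to-end sketch. It uses `io.StringIO` objects as stand-ins for the real files, and the sample data is invented; with real files you would open them with `newline=''` as the `csv` module recommends.

```python
import csv
import io

replacements = [
    ('abc', 'x'),
    ('def', 'y'),
    ('ghi', 'z'),
]
# str.maketrans accepts a dict mapping single characters to replacement strings.
trans = str.maketrans({k: v for chars, v in replacements for k in chars})

# In-memory stand-ins for the input and output files (sample data is made up).
infile = io.StringIO('badge,high\n"a,b",123\n')
outfile = io.StringIO()

reader = csv.reader(infile)
writer = csv.writer(outfile)
for row in reader:
    # Translate each field separately so delimiters and quoting stay intact.
    writer.writerow([item.translate(trans) for item in row])

result = outfile.getvalue()
print(result)
```

Translating the parsed fields rather than the raw lines means a comma inside a quoted field is still written back correctly quoted.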
Answer 2 (score: 1)
Here is something that combines the two answers, in a way, and works well (many times faster than the code in the question):
    replacements = [
        ('abc', 'x'),
        ('def', 'y'),
        ('ghi', 'z'),
    ]
    # flatten the (characters, replacement) pairs into one character-to-replacement dict
    mapping = {a: b for c, b in replacements for a in c}
    for row in reader:
        newrow = [''.join(mapping.get(c, c) for c in item) for item in row]
        writer.writerow(newrow)
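If you want to check the speed difference on your own data, a rough timing sketch along these lines may help. The function names, the sample row, and the repetition count are all assumptions chosen for illustration, and the numbers will vary by machine and input, so none are quoted here.

```python
import timeit

replacements = [('abc', 'x'), ('def', 'y'), ('ghi', 'z')]
sets = [(v, set(chars)) for chars, v in replacements]
mapping = {a: b for chars, b in replacements for a in chars}
trans = str.maketrans(mapping)

# Made-up sample row; substitute rows from your own file for a realistic test.
row = ['the quick brown fox', 'jumps over', 'the lazy dog'] * 100


def chained(row):
    # One pass over every item per set, as in the question.
    for v, s in sets:
        row = [''.join(v if c in s else c for c in item) for item in row]
    return row


def with_dict(row):
    # Single pass with a dict lookup per character.
    return [''.join(mapping.get(c, c) for c in item) for item in row]


def with_translate(row):
    # Single call per item using a prebuilt translation table.
    return [item.translate(trans) for item in row]


# All three must agree before their timings are worth comparing.
assert chained(row) == with_dict(row) == with_translate(row)

for func in (chained, with_dict, with_translate):
    seconds = timeit.timeit(lambda: func(row), number=50)
    print(f'{func.__name__}: {seconds:.3f} s')
```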