My CSV file ("challenges.csv") contains many rows like the following (the number of columns varies; there are about 8000 rows):
2937,58462bc9a559fa7d29819028,29,57eb63d813fd7c0329bdb01f,
2938,58462bc9a559fa7d29819028,30,57eb63d713fd7c0329bdafb5,57eb63d713fd7c0329bdafb6
I also have a dictionary named mydic, built from "forDic.csv", for example:
{ '58462bc9a559fa7d29819028': 'negative chin up', '57eb63d813fd7c0329bdb01f': 'knee squeeze squat', '57eb63d713fd7c0329bdafb5': 'squat', '57eb63d713fd7c0329bdafb6': 'lunge', ... }
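I want to replace each value in "challenges.csv" with the corresponding value from mydic whenever that value equals a key of mydic.
How can I do this? Please help.
Expected output: a CSV file containing rows like the following: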
2937,'negative chin up',29,'knee squeeze squat',
2938,'negative chin up',30,'squat','lunge'
My attempt so far (it does not produce the expected output):
import csv

with open('./forDic.csv', mode='r') as infile:
    reader = csv.reader(infile)
    mydic = dict((rows[0], rows[1]) for rows in reader)
    print(mydic)

def replace_all():
    with open('./challenges.csv', mode='r') as infile, open('./challenges_new.csv', mode='w') as outfile:
        r = csv.reader(infile)
        w = csv.writer(outfile)
        for row in r:
            for k in iter(mydic.keys()):
                print(', '.join(row))
                rl = [w.replace(str(k), str(mydic.values())) for w in rl]
                print(rl[0])
                row_list_string = ' / '.join(map(str, rl))
        for k in list(mydic.keys()):
            k = k.replace(k, mydic.get(k))
            print(k)

replace_all()
Answer 0 (score: 0)
Assuming challenges.csv contains:
317, change1, 89, change2, change3
318, change1, 89, change3, change4
and fordic.csv contains:
change1, changedto1
change2, changedto2
change3, changedto3
change4, changedto4
The following code just prints the rows with the replacements applied:
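import csv
import re

# Load the replacement table: first column is the key, second is the new value.
# skipinitialspace drops the blank after each comma in the sample data.
with open('fordic.csv', mode='r', newline='') as infile:
    reader = csv.reader(infile, skipinitialspace=True)
    mydic = dict((rows[0], rows[1]) for rows in reader)
print(mydic)

# One alternation of all escaped keys, longest first so that a key which is
# a prefix of another key cannot shadow it.
keys = sorted(mydic, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys))

with open('./challenges.csv', mode='r') as infile:
    for row in infile:
        # Each row keeps its own newline, so suppress print's extra one.
        print(pattern.sub(lambda m: mydic[m.group(0)], row), end='')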
Output:
317, changedto1, 89, changedto2, changedto3
318, changedto1, 89, changedto3, changedto4
To understand the multi-string replacement, see this SO answer.
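As a self-contained illustration of the regex-alternation idea used above, here is a minimal sketch that runs the same kind of substitution on an in-memory string instead of files. The helper name `replace_keys` and the sample mapping are assumptions made for this example; they are not part of the original answer.

```python
import re

def replace_keys(line, mapping):
    """Replace every occurrence of a key from `mapping` in `line` with its value."""
    if not mapping:
        # An empty alternation would match the empty string everywhere.
        return line
    # Longest keys first, so a key that is a prefix of another cannot shadow it.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], line)

# Hypothetical sample data mirroring the example above.
mapping = {
    "change1": "changedto1",
    "change2": "changedto2",
    "change3": "changedto3",
}
result = replace_keys("317, change1, 89, change2, change3", mapping)
print(result)
```

Note that this approach substitutes keys wherever they appear in the line, including inside a longer cell value. If only cells that equal a key exactly should change, a per-cell dictionary lookup on parsed CSV rows is the safer choice.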
Answer 1 (score: 0)
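It is better not to try to update the file in place, but to write the output to a new temporary file instead. This script attempts a dictionary replacement on every column value and writes each row to the temporary file. With this approach the file can be any size, because it never has to be loaded into memory in full.
The following approach should work: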
import csv
import os

challenges = 'challenges.csv'
temp = '_temp.csv'

with open('forDic.csv', newline='') as f_fordic:
    mydic = {row[0] : row[1] for row in csv.reader(f_fordic)}

with open(challenges, newline='') as f_challenges, open(temp, 'w', newline='') as f_temp:
    csv_temp = csv.writer(f_temp)

    for row in csv.reader(f_challenges):
        csv_temp.writerow([mydic.get(c.strip(), c.strip()) for c in row])

# Rename the temp file back to challenges (optional)
os.remove(challenges)
os.rename(temp, challenges)
This gives you an updated challenges.csv file that looks like this:
2937,negative chin up,29,knee squeeze squat,
2938,negative chin up,30,squat,lunge
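To try the per-cell lookup without touching any files, the sketch below applies the same `mydic.get(cell, cell)` idea to in-memory streams. The function name `translate_rows` and the use of `io.StringIO` are assumptions for illustration only; the data is the sample from the question.

```python
import csv
import io

def translate_rows(src, dst, mapping):
    """Copy CSV rows from `src` to `dst`, replacing each cell found in `mapping`."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        # Cells that are not keys of `mapping` are written back unchanged (stripped).
        writer.writerow([mapping.get(cell.strip(), cell.strip()) for cell in row])

mydic = {
    "58462bc9a559fa7d29819028": "negative chin up",
    "57eb63d813fd7c0329bdb01f": "knee squeeze squat",
    "57eb63d713fd7c0329bdafb5": "squat",
    "57eb63d713fd7c0329bdafb6": "lunge",
}
src = io.StringIO(
    "2937,58462bc9a559fa7d29819028,29,57eb63d813fd7c0329bdb01f,\n"
    "2938,58462bc9a559fa7d29819028,30,57eb63d713fd7c0329bdafb5,57eb63d713fd7c0329bdafb6\n"
)
dst = io.StringIO()
translate_rows(src, dst, mydic)
output = dst.getvalue()
print(output, end="")
```

Because rows are read and written one at a time, the same function should also work with real file objects opened with `newline=''`, and memory use stays independent of file size.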