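Replacing entire strings in a CSV

Time: 2013-08-30 21:38:57

Tags: python string csv

When I run this code to edit my CSV file, only part of a string is replaced, even though the full string is in my dictionary.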

import re

def replace_all(text, dic):
    for i, j in dic.iteritems():
        text = text.replace(i, j)
    return text

bottle = "vial jug canteen urn jug33"
transport = "car automobile airplane scooter"

mydict = {}
for word in bottle.split():
    mydict[word] = 'bottle'
for word in transport.split():
    mydict[word] = 'transport'
print(mydict) # test


with open('replacesample.csv','r') as f:
    text=f.read()
    text=replace_all(text,mydict)
    text=re.sub(r'PROD\s(?=[1-9])',r'PROD',text)

with open('file2.csv','w') as w:
    w.write(text)

For example, if my starting CSV looks like this:

jug 
canteen 
urn
car
automobile
swag
airplane
jug33

I end up with:

bottle 
bottle 
bottle
transport
transport
swag
transport
bottle33

How can I fix this?

Expected:

bottle 
bottle 
bottle
transport
transport
swag
transport
bottle

1 answer:

Answer 0: (score: 0)

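You are using a dictionary to enumerate your replacement patterns. In Python 2, which this code targets (it calls iteritems()), dictionaries return their keys and values in arbitrary order.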

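As a result, the jug -> bottle replacement can happen before the jug33 -> bottle replacement. Because str.replace() matches substrings, it also applies to partial words: jug33 becomes bottle33, and by then the jug33 key no longer matches anything.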

The solution is to sort the keys by length in descending order, so that longer matches are replaced first:

def replace_all(text, dic):
    for i, j in sorted(dic.iteritems(), key=lambda i: len(i[0]), reverse=True):
        text = text.replace(i, j)
    return text
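
On Python 3, dict.iteritems() no longer exists, so the function above would need items() instead. A minimal sketch of the same longest-key-first idea, assuming Python 3 (the sample dictionary and input string here are illustrative only):

```python
def replace_all(text, dic):
    # Process the longest keys first, so 'jug33' is replaced
    # before its prefix 'jug' gets a chance to match inside it.
    pairs = sorted(dic.items(), key=lambda item: len(item[0]), reverse=True)
    for old, new in pairs:
        text = text.replace(old, new)
    return text


# Illustrative data, not taken from the CSV file in the question.
sample = {'jug': 'bottle', 'jug33': 'bottle', 'car': 'transport'}
print(replace_all('jug33 jug car', sample))  # bottle bottle transport
```

Sorting on every call is cheap for a small dictionary; for a large one, the sorted list could be built once and reused.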

Demo:

>>> def replace_all(text, dic):
...     for i, j in dic.iteritems():
...         text = text.replace(i, j)
...     return text
... 
>>> replace_all('jug33 jug', mydict)
'bottle33 bottle'
>>> def replace_all(text, dic):
...     for i, j in sorted(dic.iteritems(), key=lambda i: len(i[0]), reverse=True):
...         text = text.replace(i, j)
...     return text
... 
>>> replace_all('jug33 jug', mydict)
'bottle bottle'
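
Sorting by length fixes the jug/jug33 case, but str.replace() still matches inside longer words (for example, car inside carpet). If only whole words should be replaced, one possible alternative is a single regular-expression pass with word boundaries. This is a sketch rather than part of the original answer: the name replace_whole_words is hypothetical, it is written for Python 3, and it assumes every key starts and ends with a word character so that \b behaves as intended.

```python
import re


def replace_whole_words(text, dic):
    # Hypothetical helper: replaces keys only where they appear as whole words.
    # Longest keys first, so the alternation tries 'jug33' before 'jug'.
    keys = sorted(dic, key=len, reverse=True)
    pattern = re.compile(
        r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b'
    )
    # One pass over the text: already-replaced output is never rescanned.
    return pattern.sub(lambda m: dic[m.group(0)], text)


# Illustrative data, not taken from the CSV file in the question.
words = {'jug': 'bottle', 'jug33': 'bottle', 'car': 'transport'}
print(replace_whole_words('jug33 jug carpet car', words))
# bottle bottle carpet transport
```

Because the substitution happens in one pass, a replacement value that happens to contain another key is not replaced again, which the loop-based version cannot guarantee.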