Replacing text based on a list

Time: 2013-10-23 07:35:52

Tags: python text replace

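I have text in a file, as follows:

One reason the Fed is likely to wait until early 2014 to begin easing back on stimulus efforts is that policy makers there simply will not know if the labor market is gaining or losing strength before then. Not until December will the monthly jobs survey be free of the shutdown static, and that report does not come out until early January.

The September jobs report was disappointing, with the economy adding 148,000 new jobs instead of the expected 185,000, but stocks rose on anticipation that Fed stimulus efforts would continue well into 2014.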

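In another file I have a list of replacements:

January:Febryary   September:november   monthly:weekly

How can I replace all the words in the text according to the replacement list?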

I tried this:
with open('t_.txt') as f3:
    with open ('egb.out') as w3:

        for line in f3:
            for line1 in w3:

                word,string = line1.split(':')
                print line.replace(word,string),

But it only works for the first line.
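A likely reason for this behaviour: a file object is a one-shot iterator, so the inner loop over `w3` consumes the whole replacement file while the first line of `f3` is being processed, and finds nothing left for later lines. The attempt also prints one partially replaced copy of the line per pair, and leaves the trailing newline in the replacement string. The sketch below reproduces the effect and shows one possible fix. It is written for Python 3, uses `io.StringIO` objects with made-up contents as stand-ins for the two files, and assumes one `word:replacement` pair per line, as the loop in the attempt implies.

```python
import io

# Stand-ins for the two files in the question (contents are illustrative)
text_file = io.StringIO("monthly jobs survey\nearly January\n")
repl_file = io.StringIO("January:Febryary\nmonthly:weekly\n")

# A file object is a one-shot iterator: the first full pass consumes it.
first_pass = list(repl_file)
second_pass = list(repl_file)  # empty, the iterator is already exhausted

# Fix: parse the replacement pairs once, then reuse the list for every line.
# strip() drops the trailing newline that would otherwise end up in the output.
pairs = [entry.strip().split(':', 1) for entry in first_pass if ':' in entry]

fixed_lines = []
for line in text_file:
    for word, replacement in pairs:
        line = line.replace(word, replacement)  # accumulate all replacements
    fixed_lines.append(line)

print(''.join(fixed_lines), end='')
```

With real files, the same structure applies: read `egb.out` into `pairs` first, then loop over `t_.txt`.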

2 answers:

Answer 0 (score: 2)

After reading the two files into strings, something along these lines should work:

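# text holds the contents of the first file
# replacements holds the contents of the replacement list
for w in replacements.split():  # split on any whitespace, including newlines
    if ':' in w:
        word, replacement = w.split(':', 1)  # split on the first colon only
        text = text.replace(word, replacement)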
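For completeness, here is a minimal sketch of the step the answer leaves out, reading both files into strings before applying the loop. It is written for Python 3. The function name `replace_from_files` is hypothetical, and the sketch assumes the pairs in the replacement file are separated by spaces or newlines.

```python
from pathlib import Path


def replace_from_files(text_path, replacements_path):
    """Return the text file's contents with every 'word:replacement' pair applied."""
    text = Path(text_path).read_text()
    replacements = Path(replacements_path).read_text()
    # split() with no argument splits on any run of whitespace, newlines included
    for item in replacements.split():
        if ':' in item:
            word, replacement = item.split(':', 1)
            text = text.replace(word, replacement)
    return text
```

With the file names from the question, the call would be `print(replace_from_files('t_.txt', 'egb.out'))`.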

Answer 1 (score: 2)

Use a dictionary, and a string like this (or read it from a file, or whatever):

rep = {'January':'Febryary', 'September':'november', 'monthly':'weekly'}

s = """One reason the Fed is likely to wait until early 2014 to begin easing back on stimulus efforts is that policy makers there simply will not know if the labor market is gaining or losing strength before then. Not until December will the monthly jobs survey be free of the shutdown static, and that report does not come out until early January.

The September jobs report was disappointing, with the economy adding 148,000 new jobs instead of the expected 185,000, but stocks rose on anticipation that Fed stimulus efforts would continue well into 2014."""

Then you can use this one-liner:

result = reduce(lambda x, y: x.replace(*y), rep.iteritems(), s)
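The one-liner uses Python 2 names (`reduce` as a builtin, `dict.iteritems`). A Python 3 equivalent might look like the sketch below, where `replace_all` is a hypothetical helper name. It also illustrates one caveat of applying `str.replace` pair by pair: a later pair can rewrite the output of an earlier one.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

rep = {'January': 'Febryary', 'September': 'november', 'monthly': 'weekly'}


def replace_all(text, mapping):
    # Fold the mapping over the text: each step applies one (old, new) pair
    return reduce(lambda acc, pair: acc.replace(*pair), mapping.items(), text)


result = replace_all('the monthly jobs survey, early January', rep)

# Caveat: replacements chain. 'a' becomes 'b', which the next pair turns into 'c'
# (dicts keep insertion order in Python 3.7+, so the pairs run in this order).
chained = replace_all('a', {'a': 'b', 'b': 'c'})
```

For the word list in the question no pair feeds into another, so the chaining does not matter there.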

Or use regular expressions (more efficient, in my opinion):

import re

rep = dict((re.escape(k), v) for k, v in rep.iteritems())  # escape keys so regex metacharacters match literally
pattern = re.compile("|".join(rep.keys()))  # one alternation over all escaped keys
result = pattern.sub(lambda m: rep[re.escape(m.group(0))], s)  # look up each match in a single pass
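A possible Python 3 variant of the regex approach is sketched below. It goes beyond the snippet above in two ways that are my own additions, not part of the answer: `\b` word boundaries, so that `monthly` is not replaced inside `bimonthly`, and longest-first ordering of the keys. The name `make_replacer` is hypothetical, and the sketch assumes a non-empty mapping whose keys are plain words that start and end with letters or digits.

```python
import re


def make_replacer(mapping):
    """Build a function that applies all replacements in a single pass."""
    # Longest keys first, so a longer key wins over a key that is its prefix
    keys = sorted(mapping, key=len, reverse=True)
    alternation = '|'.join(re.escape(k) for k in keys)
    # \b restricts matches to whole words
    pattern = re.compile(r'\b(?:' + alternation + r')\b')
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)


rep = {'January': 'Febryary', 'September': 'november', 'monthly': 'weekly'}
replace = make_replacer(rep)
result = replace('The September jobs report; monthly, bimonthly, January.')
```

Because the text is scanned only once, a replacement is never itself replaced, unlike with chained `str.replace` calls.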

But really, if you are doing this kind of thing, you should take a look at nltk (Natural Language Toolkit).