Replacing text in a file using a dict

Time: 2013-06-24 14:53:25

Tags: python

"noslang.txt" contains 5404 entries. For example:

...
2mz   tomorrow
2night   tonight
2nite   tonight
soml   story of my life
ssry   so sorry
...

In "test.txt":

ya right
i'll attend the class
2morow will b great 

My code:

 NoSlang = open("noslang.txt")
 for line in NoSlang:
      slang,fulltext = map(str, line.split('\t'))
      dic[slang] = fulltext.strip('\n')


 file = open('test.txt').read().split("\n")
 for line in file:
     sline = line.split(" ")
     for n,i in enumerate(sline):
         if i in dic:
             sline[n] = dic[i]
     print ' '.join(sline)

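I tried to build a dictionary and use it to replace the slang words in the sentences from "test.txt". The output is identical to the input; nothing is replaced.

Any suggestions?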

Expected result:

 yeah  right
 i'll attend the class
 tomorrow will be great

2 answers:

Answer 0: (score: 1)

You can use a regular expression to replace the words in the file:

#!/usr/bin/env python
import re
from functools import partial

with open('noslang.txt') as file:
    # slang word -> translation
    slang_map = dict(map(str.strip, line.partition('\t')[::2])
                     for line in file if line.strip())

slang_words = sorted(slang_map, key=len, reverse=True) # longest first for regex
regex = re.compile(r"\b({})\b".format("|".join(map(re.escape, slang_words))))
substitute_slang = partial(regex.sub, lambda m: slang_map[m.group(1)])

with open('input.txt') as file:
    for line in file:
        print substitute_slang(line),

If input.txt is not very large, you can replace all the slang in one go:

with open('input.txt') as file:
    print substitute_slang(file.read()),
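
For readers on Python 3, the same regex idea can be sketched as follows. This is only an illustration under stated assumptions: the mapping is hard-coded from a few sample entries shown in this thread instead of being read from noslang.txt, and `build_substituter` is a hypothetical helper name that does not appear in the answer above.

```python
import re

def build_substituter(slang_map):
    """Return a function that replaces whole-word slang using slang_map.

    Hypothetical helper for illustration; assumes slang_map is non-empty.
    """
    # Longest alternatives first, so a longer slang word is tried before
    # any shorter one that it starts with.
    words = sorted(slang_map, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, words)) + r")\b")
    return lambda text: pattern.sub(lambda m: slang_map[m.group(1)], text)

# Sample entries taken from the noslang.txt excerpts in this thread.
slang_map = {"ya": "yeah", "2morow": "tomorrow", "2nite": "tonight"}
substitute_slang = build_substituter(slang_map)

print(substitute_slang("ya right"))
print(substitute_slang("2morow will b great"))
print(substitute_slang("see you 2nite, ok?"))
```

Because the pattern is anchored with `\b`, a slang word is still matched when punctuation follows it (as in `2nite,`), while a longer word that merely starts with a slang word (such as `yard`) is left alone.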

Answer 1: (score: 0)

Something like this:

with open('noslang.txt') as f:
    dic = dict(line.strip().split(None, 1) for line in f)

with open('test.txt') as f:
    for line in f:
        spl = line.split()
        new_lis = [dic.get(word, word) for word in spl]
        print " ".join(new_lis)

Output:

yeah right
i'll attend the class
tomorrow will b great

where noslang.txt contains:

ya   yeah
2morow   tomorrow 
2mz   tomorrow
2night   tonight
2nite   tonight
2nyt   tonight
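
A self-contained Python 3 sketch of the same dictionary-lookup idea is given below. It is an illustration, not the answerer's code: the slang data is an in-memory sample in the same layout as noslang.txt, and `load_slang` and `replace_words` are hypothetical helper names. It also assumes every non-blank line has both a slang word and a meaning; a line with only one token would raise a `ValueError` when the dict is built.

```python
# Sample lines in the same "slang<whitespace>meaning" layout as noslang.txt.
SAMPLE_SLANG = """\
ya   yeah
2morow   tomorrow
soml   story of my life
"""

def load_slang(lines):
    """Build a slang -> meaning dict, skipping blank lines."""
    # split(None, 1) splits on the first run of whitespace only, so
    # multi-word meanings such as "story of my life" stay intact.
    return dict(line.strip().split(None, 1) for line in lines if line.strip())

def replace_words(line, dic):
    """Replace each whitespace-separated word that is a key in dic."""
    return " ".join(dic.get(word, word) for word in line.split())

dic = load_slang(SAMPLE_SLANG.splitlines())
for line in ["ya right", "i'll attend the class", "2morow will b great"]:
    print(replace_words(line, dic))
```

Since this version compares whole whitespace-separated tokens, a slang word with punctuation attached (for example `ya,`) is not a key in the dict and is left unchanged. The `b` in the last sentence also stays as it is, because the sample dictionary has no entry for it.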