I have a txt file (myText.txt) that contains several lines of text.
I would like to know:
For example, if myText.txt is:
The ancient Romans influenced countries and civilizations in the following centuries.
Their language, Latin, became the basis for many other European languages. They stayed in Roma for 3 month.
Answer 0 (score: 3)
You can always use a regular expression:
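import re
st = '''\
The ancient Romans influenced countries and civilizations in the following centuries.
Their language, Latin, became the basis for many other European languages. They stayed in Roma for 3 month.'''
# Words to delete, and words to replace (old -> new).
deletions = ('and', 'in', 'the')
repl = {"ancient": "old", "month": "years", "centuries": "years"}
# Delete every listed word; \b restricts each match to a whole word.
tgt = '|'.join(r'\b{}\b'.format(e) for e in deletions)
st = re.sub(tgt, '', st)
# Replace each remaining target word with its substitute.
for word in repl:
    tgt = r'\b{}\b'.format(word)
    st = re.sub(tgt, repl[word], st)
# print(...) with a single argument works on both Python 2 and Python 3.
print(st)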
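The deletions above leave doubled spaces behind wherever a word is removed, and the loop rescans the string once per replacement. A minimal sketch of a single-pass variant follows. The helper name `rewrite_words` and the sample sentence are my own, not part of the answer, and collapsing repeated spaces is an assumption about the desired output:

```python
import re

def rewrite_words(text, deletions, replacements):
    """Delete and replace whole words in one regex pass (hypothetical helper)."""
    # Map every target word to its substitute; deleted words map to ''.
    mapping = {word: '' for word in deletions}
    mapping.update(replacements)
    # re.escape guards against words that contain regex metacharacters.
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(w) for w in mapping) + r')\b')
    result = pattern.sub(lambda m: mapping[m.group(0)], text)
    # Collapse the runs of spaces left behind by deleted words.
    return re.sub(r' {2,}', ' ', result)

sample = "The ancient Romans stayed in Roma for 3 month and left."
print(rewrite_words(sample, ('and', 'in', 'the'), {'ancient': 'old', 'month': 'years'}))
# prints: The old Romans stayed Roma for 3 years left.
```

The match is case-sensitive, so the capitalized "The" is left alone, just as in the code above.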
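Answer 1 (score: 2)
This should do the trick. Store the words to delete in a list, then loop over the list and remove each element from the contents string. Then use a dictionary that maps each word currently in the text to the word that should replace it, and loop over that as well to make each replacement.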
def replace():
    contents = ""
    # The trailing space in each entry removes the word together with the space after it.
    deleteWords = ["the ", "and ", "in "]
    replaceWords = {"ancient": "old", "month": "years", "centuries": "years"}
    with open("myText.txt") as f:
        contents = f.read()
    for word in deleteWords:
        contents = contents.replace(word, "")
    # items() works on both Python 2 and 3; iteritems() exists only on Python 2.
    for key, value in replaceWords.items():
        contents = contents.replace(key, value)
    return contents
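One caveat with plain `str.replace`: it matches substrings, so "in " is also stripped from the end of a longer word such as "Latin " when a space follows it. A small sketch contrasting it with a word-boundary regex; the sample sentence is made up for illustration:

```python
import re

text = "Latin words remain in use"

# str.replace matches substrings: "in " also hits the tails of "Latin " and "remain ".
naive = text.replace("in ", "")
print(naive)    # prints: Latwords remause

# \b restricts the match to the standalone word "in" (plus the space after it).
bounded = re.sub(r'\bin\b ', '', text)
print(bounded)  # prints: Latin words remain use
```

If the text may contain such words, the regular-expression approach is the safer choice.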
Answer 2 (score: 2)
Use a list for the deletions and a dictionary for the replacements. It should look something like this:
def processTextFile(filename_in, filename_out, delWords, repWords):
    # Open the output once, in "w" mode, so that re-running does not append duplicate text.
    with open(filename_in, "r") as sourcefile, open(filename_out, "w") as outfile:
        for line in sourcefile:
            # Remove each unwanted word, then apply each replacement, line by line.
            for item in delWords:
                line = line.replace(item, "")
            for key, value in repWords.items():
                line = line.replace(key, value)
            outfile.write(line)

if __name__ == "__main__":
    delWords = ["the ", "and ", "in "]
    repWords = {"ancient": "old", "month": "years", "centuries": "years"}
    processTextFile("myText.txt", "myOutText.txt", delWords, repWords)
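Note that this was written for Python 3.3.2, which is why I use items(). If you are on Python 2.x you can use iteritems() instead: it avoids building a list of the dictionary's pairs, although with a dictionary this small the difference is negligible.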