Question

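I need to replace every word of length 4 in a text document with a different word.

For example, if a text file contains the phrase "I like to eat very hot soup", then the three words "like", "very" and "soup" would each be replaced with "something".

Then, instead of overwriting the original text document, it needs to create a new document containing the changed phrase.

This is what I have so far: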

def replacement():  
    o = open("file.txt","a") #file.txt will be the file containing the changed phrase
    for line in open("y.txt"):  #y.txt is the original file
        line = line.replace("????","something")  #see below
        o.write(line + "\n")
    o.close()

I have tried changing the "????" to something like

(str(len(line) == 4)

but that did not work.

Answer 1

This preserves any extra whitespace you have, which the other solutions that use str.split() do not.

import re

exp = re.compile(r'\b(\w{4})\b')
replaceWord = 'stuff'
with open('infile.txt','r') as inF, open('outfile.txt','w') as outF:
    for line in inF:
        outF.write(exp.sub(replaceWord,line))

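This uses a regular expression to replace the text. The regular expression used here has three main parts. The first matches the start of a word: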

\b

The second part matches exactly four word characters (all alphanumeric characters and _):

(\w{4})

The last part is like the first, except that here it matches the end of a word:

\b
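To see the three parts working together, here is a small sketch that applies the same pattern to the example sentence from the question, held in memory rather than read from a file. The sample strings and the name no_boundaries are illustrative assumptions, not part of the original answer.

```python
import re

# Same pattern as above: word boundary, exactly four word characters, word boundary.
exp = re.compile(r'\b(\w{4})\b')

# Illustrative variant without the \b anchors, to show why they matter.
no_boundaries = re.compile(r'\w{4}')

sample = "I like to  eat very hot   soup\n"

# Only whole four-character words are replaced; the spacing and the newline survive.
replaced = exp.sub('stuff', sample)
print(repr(replaced))

# Without \b, four-character chunks inside longer words would match too.
print(no_boundaries.sub('stuff', "something"))
```

Because \w also covers digits and the underscore, a token such as 2024 counts as a four-character word under this pattern.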

Answer 2

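This looks like homework, so here are some key concepts.

When you read a file, the lines you get are strings. You can split a line into a list using the string method .split(), like this: words = line.split(). This creates a list of words.

Now, a list is iterable, which means you can use a for loop on it and operate on one item of the list at a time. You want to check the length of each word, so you have to iterate over words with a loop and do something with each one. You were already close to figuring out how to check the length of a word using len(word).

You also need a place to store the final information. Outside the loop, create a list for the results, and .append() each word to it as you check it.

Finally, you need to do this for every line in the file, which means a second for loop that iterates over the file.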
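Put together, these steps might look like the following sketch. The function name replace_four_letter_words, its parameters, and the in-memory sample line are assumptions made for illustration; with a real file you would pass the open file object as lines.

```python
def replace_four_letter_words(lines, replacement="something", length=4):
    """Return new lines with every word of the given length replaced."""
    new_lines = []
    # Outer loop: one pass per line of the file (or any iterable of strings).
    for line in lines:
        words = line.split()      # split the line into a list of words
        result = []               # storage for this line's checked words
        # Inner loop: look at one word at a time.
        for word in words:
            if len(word) == length:
                result.append(replacement)
            else:
                result.append(word)
        new_lines.append(" ".join(result))
    return new_lines


print(replace_four_letter_words(["I like to eat very hot soup\n"]))
```

Writing each returned line to a new file, followed by a newline, would complete the task.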

Answer 3

with open('file.txt', 'a') as write_file:
    with open('y.txt') as read_file:
        for line in read_file.readlines():
            # Replace the needed words
            line = line.replace('????', 'something')
            write_file.write(line)

Answer 4

First, let's make a function that returns something if it is given a word of length 4, and the word it was given otherwise:

def maybe_replace(word, length=4):
  if len(word) == length:
    return 'something'
  else:
    return word
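A quick way to convince yourself that the helper behaves as intended is to call it on a few words. The sample words below are arbitrary choices for illustration.

```python
def maybe_replace(word, length=4):
    # Return 'something' for words of the given length, else the word itself.
    if len(word) == length:
        return 'something'
    else:
        return word


print(maybe_replace('soup'))            # four characters, so it is replaced
print(maybe_replace('hot'))             # three characters, returned unchanged
print(maybe_replace('hot', length=3))   # a different target length
print(maybe_replace('soup.'))           # punctuation counts toward the length
```

Note that len() counts every character, so a word with attached punctuation such as soup. has length 5 and is left alone.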

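Now let's walk through your for loop. In each iteration you have one line of the original file. Let's split it into words. Python gives us the split function, which we can use: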

   split_line = line.split()

The default is to split on whitespace, which is exactly what we want. There is more documentation if you need it.
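As a small illustration of that default behavior, the sketch below splits a made-up line that contains repeated spaces, a tab, and a trailing newline.

```python
line = "I like  to eat\tvery hot soup\n"

# With no arguments, split() treats any run of whitespace as one separator
# and drops leading and trailing whitespace, including the newline.
split_line = line.split()
print(split_line)

# Passing an explicit separator behaves differently: empty strings appear
# wherever two separators are adjacent.
print("a  b".split(" "))
```

This is also why rejoining the words with a single space does not reproduce the original spacing of the line.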

Now we want the list we get by calling the maybe_replace function on every word:

  new_split_line = [maybe_replace(word) for word in split_line]

Now we can join these back together using the join method:

  new_line = ' '.join(new_split_line)

and write it back to our file:

  o.write(new_line + '\n')

So our final function will be:

def replacement():  
  o = open("file.txt","a") #file.txt will be the file containing the changed phrase
  for line in open("y.txt"):  #y.txt is the original file
    split_line = line.split()
    new_split_line = [maybe_replace(word) for word in split_line]
    new_line = ' '.join(new_split_line)
    o.write(new_line + '\n')
  o.close()
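The same logic can also be written with with blocks, so that both files are closed even if an error occurs, and with the output opened in "w" mode so that repeated runs do not keep appending to the file. This is only a sketch: the parameter names src_path and dst_path and the choice of "w" mode are assumptions, not part of the answer above.

```python
def maybe_replace(word, length=4):
    # Return 'something' for words of the given length, else the word itself.
    return 'something' if len(word) == length else word


def replacement(src_path="y.txt", dst_path="file.txt"):
    # "w" creates the output file, or truncates it if it already exists.
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            new_words = [maybe_replace(word) for word in line.split()]
            dst.write(' '.join(new_words) + '\n')
```

Calling replacement() with no arguments reads y.txt and writes file.txt, matching the file names used in the question.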

Python 3.2: Replace all words of a certain length in a text document?
