Question

This is what I came up with before getting stuck (NB: the text comes from The Economist):

import random
import re

text = 'One calculation by a film consultant implies that half of Hollywood productions with budgets over one hundred million dollars lose money.'

nbofwords = len(text.split())

words = text.split()

randomword = random.choice(words)
randomwordstr = str(randomword)

Step 1 works: remove a random word from the original text

replaced1 = re.sub(randomwordstr, '', text)
replaced2 = re.sub('  ', ' ', replaced1)

Step 2 works: select a defined number of random words

nbofsamples = 3
randomitems = random.choices(population=words, k=nbofsamples)

which gives, for example, ['over', 'consultant', 'One']

Step 3 works: remove one element of that random word list from the original text, using its index

replaced3 = re.sub(randomitems[1], '', text)
replaced4 = re.sub('  ', ' ', replaced3)

This removes the word 'consultant'.

Step 4 fails: remove all elements of that random word list from the original text, using their indices.

The best I could come up with is:

replaced5 = re.sub(randomitems[0],'',text)
replaced6 = re.sub(randomitems[1],'',replaced5)
replaced7 = re.sub(randomitems[2],'',replaced6)
replaced8 = re.sub('  ', ' ', replaced7)
print(replaced8)

It works (all 3 words are removed), but it is clumsy and inefficient (if I change the nbofsamples variable, I have to rewrite it).

How can I iterate over my list of random words (step 2) to remove those words from the original text?

Thanks in advance

Answer 1

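To remove the words in a list from a string, just use a for loop. It goes through each item in the list, assigns the item's value to a variable of your choosing (here I use "i", but almost any ordinary variable name would do), and runs until there are no more items left. Here is a basic for loop: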

items = ['a', 'b', 'c']  # not named `list`, which would shadow the built-in type
for i in items:
    print(i)

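In your case, you want to remove the words given in the list from the string, so just plug the variable "i" into the same method you already use to remove a word. You also need a variable that is updated on every pass; otherwise the loop would only remove the last word in the list from the string. After the loop, you can print the result. The loop runs once for each item, however long the list is.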

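r = text  # start from the original text and update r on every pass
for i in randomitems:
    r = re.sub(re.escape(i), '', r)  # re.escape makes the word match literally
r = re.sub('  ', ' ', r)  # collapse the double spaces left behind
print(r)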
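The loop above calls re.sub once per word and matches the word anywhere, including inside longer words. As a rough sketch of an alternative, the words can be combined into one pattern that only matches whole whitespace-delimited tokens. The function name remove_words and the shortened sample sentence are made up for this illustration and are not part of the question's code.

```python
import re

def remove_words(text, words_to_remove):
    """Remove every whitespace-delimited token listed in words_to_remove."""
    if not words_to_remove:
        return text
    # Escape each word so characters such as '.' are matched literally.
    alternatives = '|'.join(re.escape(w) for w in set(words_to_remove))
    # (?<!\S) and (?!\S) require whitespace or a string edge on both sides,
    # so 'a' does not match inside 'calculation'.
    pattern = re.compile(r'(?<!\S)(?:' + alternatives + r')(?!\S)')
    stripped = pattern.sub('', text)
    # Collapse the runs of spaces left behind and trim both ends.
    return ' '.join(stripped.split())

text = 'One calculation by a film consultant implies that half of Hollywood productions lose money.'
print(remove_words(text, ['a', 'One', 'money.']))
```

Because the list is turned into a single pattern, the same call works whatever the value of nbofsamples is.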

Answer 2

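Note that as long as you are not using any regular expression features and are only replacing plain strings with other strings (or with nothing), you do not need re: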

for r in randomitems:
    text = text.replace(r, '')
print(text)

To replace only the first occurrence, you can simply pass the desired count to replace:

text = text.replace(r, '', 1)
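One caveat with str.replace is that it works on substrings, so a short word such as 'a' is also removed from inside longer words. If whole words are what you want to drop, one possible approach (a minimal sketch, with drop_words as a hypothetical helper name and a shortened sample sentence) is to split the text into tokens and keep only those that are not in the list:

```python
def drop_words(text, words_to_remove):
    """Return text without any token that appears in words_to_remove."""
    unwanted = set(words_to_remove)  # set lookup is fast for long lists
    # Split on whitespace, keep the wanted tokens, rejoin with single spaces.
    return ' '.join(w for w in text.split() if w not in unwanted)

text = 'One calculation by a film consultant implies that half of Hollywood productions lose money.'

# Substring replacement also strips the letter from 'calculation', 'consultant', ...
print(text.replace('a', ''))

# Token filtering removes only the standalone words 'a' and 'of'.
print(drop_words(text, ['a', 'of']))
```

This also avoids the leftover double spaces, since the tokens are rejoined with exactly one space. Note that tokens produced by split() keep their punctuation, so 'money.' and 'money' count as different words here.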

In Python, how do I remove certain words from a string based on a list?
