Question

My list:

city=['Venango Municiplaity', 'Waterford ship','New York']

Expected result:

city = ['Venango Municiplaity ', 'Waterford ship','New York','Venango','Waterford']

Common words:

common_words = ['ship','municipality']

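I want to scan every item in "My list", strip out the common words, and append the stripped items to the same list, as shown in "Expected result".

I can find the items that contain a common word, but I'm not sure how to replace that word with an empty string and insert the result back into "My list".

My code so far: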

for item in city:
    if(any(x in s.lower() for s in item.split(' ') for x in common_words)) :

Answer 1

I wrote a short snippet that does what you want:

city=['Venango Municiplaity', 'Waterford ship','New York']
comwo = ['ship','municipality']
# Loop over a copy so that appending to city does not extend the loop
for c in list(city):
    for ii in comwo:
        if ii in c:
            city.append(c.replace(ii,""))
print(city)

Output:

['Venango Municiplaity', 'Waterford ship', 'New York', 'Waterford ']

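Note:

The list you created contains a misspelling.
Compare the first element of city, Venango Municiplaity, with the second element of common_words, municipality.

Edit:

If you also want to remove the space before the word (when there is one), here is a separate snippet: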

city=['Village home', 'Villagehome','New York']
comwo = ['home']
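# Loop over a copy so that appending to city does not extend the loop
for c in list(city):
    for ii in comwo:
        if ii in c:
            # Prefer removing the word together with the space before it
            stripped = c.replace(" "+ii,"")
            if stripped == c:
                stripped = c.replace(ii,"")
            city.append(stripped)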
print(city)

Output:

['Village home', 'Villagehome', 'New York', 'Village', 'Village']
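
The snippets in this answer test with in on the original string, so 'municipality' would not match 'Municipality' even with the spelling fixed. Below is a minimal sketch of a case-insensitive variant. It assumes plain ASCII names (so that lower() does not change the string length), and the list name extra is only illustrative.

```python
city = ['Venango Municipality', 'Waterford ship', 'New York']
comwo = ['ship', 'municipality']

extra = []
for c in city:
    for w in comwo:
        # Locate the word case-insensitively, then cut it out by position
        idx = c.lower().find(w)
        if idx != -1:
            extra.append((c[:idx] + c[idx + len(w):]).strip())

city += extra
print(city)
# ['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']
```

Like the original, this still matches inside longer words, so it is only a fit when the common words do not occur as parts of other words.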

Answer 2

I suggest the following solution, which combines re.sub with flags=re.IGNORECASE to remove the common words regardless of case:

import re

city = ['Venango Municipality', 'Waterford ship','New York']
common_words = ['ship','municipality']

toAppend = []

for c in city:
    for cw in common_words:
        if cw.lower() in c.lower().split():
            toAppend.append(re.sub(cw, "", c, flags=re.IGNORECASE).strip())

city += toAppend

print(city) # ['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Here is a one-line version using a list comprehension; it is shorter but much less readable:

import re

city = ['Venango Municipality', 'Waterford ship','New York']
common_words = ['ship','municipality']

city += [re.sub(cw, "", c, flags=re.IGNORECASE).strip() for c in city for cw in common_words if cw.lower() in c.lower().split()]

print(city) # ['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']
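
One caveat with the pattern above: re.sub(cw, ...) also removes the word where it appears inside a longer word, so 'Shipton ship' would come out as 'ton'. A possible refinement, sketched here under the assumption that only whole words should be removed, wraps each word in \b word boundaries and escapes it with re.escape. The sample name 'Shipton ship' and the variable to_append are made up for the illustration.

```python
import re

city = ['Shipton ship', 'Venango Municipality', 'New York']
common_words = ['ship', 'municipality']

to_append = []
for c in city:
    for cw in common_words:
        # \b limits the match to whole words; re.escape protects against
        # common words that contain regex metacharacters
        pattern = r"\b" + re.escape(cw) + r"\b"
        stripped, count = re.subn(pattern, "", c, flags=re.IGNORECASE)
        if count:
            # Collapse any whitespace left behind by the removal
            to_append.append(" ".join(stripped.split()))

print(city + to_append)
# ['Shipton ship', 'Venango Municipality', 'New York', 'Shipton', 'Venango']
```

re.subn returns the number of replacements along with the new string, which removes the need for a separate membership test.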

Answer 3

You can try this: create a new list to hold the items that should be added to the original list, then concatenate the two:

In [1]: city=['Venango Municiplaity', 'Waterford ship','New York']

In [2]: common_words = ['ship', 'municiplaity']

In [3]: list_add = []

In [4]: for item in city:
   ...:     item_words = [s.lower() for s in item.split(' ')]
   ...:     if set(common_words) & set(item_words):
   ...:         new_item = [s for s in item.split(' ') if s.lower() not in common_words]
   ...:         list_add.append(" ".join(new_item))
   ...:         

In [5]: city + list_add
Out[5]: ['Venango Municiplaity', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Answer 4

Here is one approach using a regular expression.

Demo:

import re

city=['Venango Municiplaity', 'Waterford ship','New York']
common_words = ['ship','municiplaity']
common_words = "(" + "|".join(common_words) + ")"

res = []
for i in city:
    if re.search(common_words, i, flags=re.IGNORECASE):
        res.append(i.strip().split()[0])
print(city + res)

Output:

['Venango Municiplaity', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Answer 5

You can use a list comprehension to work out whether an item contains something that should be added to the city list.

city=['Venango Municipality', 'Waterford ship','New York']

common_words = ['ship','municipality']
items_to_add = []
for item in city: 
  toAddition = [word for word in item.split() if word.lower() not in common_words]
  if ' '.join(toAddition) != item:
    items_to_add.append(' '.join(toAddition))

print(city + items_to_add)

Output:

['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Answer 6

Collect the results in a separate list, then use list.extend() to append the contents of that list to the original list:

cities = ['Venango Municipality', 'Waterford ship', 'New York']

common_words = ['ship', 'municipality']

add_list = []

for city in cities:
    rl = []
    triggered = False
    for city_word in city.split():
        if city_word.lower() in common_words:
            triggered = True
        else:
            rl.append(city_word)
    if triggered:
        add_list.append(' '.join(rl))

cities.extend(add_list)
print(cities)
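
The same idea can be packaged as a function. The sketch below is one way to do it, not part of the original answer: the name with_stripped_names is hypothetical, it returns a new list instead of modifying its argument, and it skips names that consist only of common words so that no empty string is added.

```python
def with_stripped_names(cities, common_words):
    """Return cities plus a stripped copy of each name containing a common word."""
    common = {w.lower() for w in common_words}  # set for fast membership tests
    extra = []
    for name in cities:
        words = name.split()
        kept = [w for w in words if w.lower() not in common]
        # Add only when something was removed and something is left over
        if kept and len(kept) < len(words):
            extra.append(' '.join(kept))
    return cities + extra


result = with_stripped_names(
    ['Venango Municipality', 'Waterford ship', 'New York', 'Ship'],
    ['ship', 'municipality'],
)
print(result)
# ['Venango Municipality', 'Waterford ship', 'New York', 'Ship', 'Venango', 'Waterford']
```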

Answer 7

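An approach that splits each name into words:

city=['Venango Municipality', 'Waterford ship','New York']
common_words = ['ship','municipality']

# Iterate over a copy so that appending to city does not affect the loop
for item in list(city):
    word_list = item.split()
    # Keep only the words that are not common words
    kept = [word for word in word_list if word.lower() not in common_words]
    if len(kept) != len(word_list):
        city.append(" ".join(kept))

print(city)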

Output:

['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Answer 8

Try using extend:

city.extend([i.split()[0] for i in city if i.split()[1].lower() in map(str.lower,common_words)])

Demo:

>>> city=['Venango Municipality', 'Waterford ship','New York']
>>> common_words = ['ship','municipality']
>>> city.extend([i.split()[0] for i in city if i.split()[1].lower() in map(str.lower,common_words)])
>>> city
['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']
>>>

If you want to allow for misspellings:

>>> city=['Venango Municiplaity', 'Waterford ship','New York']
>>> common_words = ['ship','municipality']
>>> from difflib import SequenceMatcher
>>> city.extend([i.split()[0] for i in city if any(SequenceMatcher(None,i.split()[1].lower(),v).ratio()>0.8 for v in map(str.lower,common_words))])
>>> city
['Venango Municiplaity', 'Waterford ship', 'New York', 'Venango', 'Waterford']
>>>
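
Both one-liners index i.split()[1], which raises IndexError for a single-word name such as 'Erie' and only ever inspects the second word. A longer sketch that checks every word is shown below. It reuses the 0.8 threshold from above; the helper name is_common and the extra sample name 'Erie' are assumptions made for the example.

```python
from difflib import SequenceMatcher


def is_common(word, common_words, threshold=0.8):
    """True if word is close enough to any common word, ignoring case."""
    return any(
        SequenceMatcher(None, word.lower(), cw.lower()).ratio() > threshold
        for cw in common_words
    )


city = ['Venango Municiplaity', 'Waterford ship', 'New York', 'Erie']
common_words = ['ship', 'municipality']

extra = []
for name in city:
    words = name.split()
    kept = [w for w in words if not is_common(w, common_words)]
    # Add only when a word was dropped and the name is not empty afterwards
    if kept and len(kept) < len(words):
        extra.append(' '.join(kept))

city.extend(extra)
print(city)
# ['Venango Municiplaity', 'Waterford ship', 'New York', 'Erie', 'Venango', 'Waterford']
```

The threshold is a judgment call: a lower value tolerates more typos but also risks treating unrelated short words as common words.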

Replace a word in a list and append the result to the same list

8 answers: