Replace a word in a list and append it to the same list

Asked: 2018-08-22 07:23:04

Tags: python list

My list:

city=['Venango Municiplaity', 'Waterford ship','New York']

Expected result:

city = ['Venango Municiplaity ', 'Waterford ship','New York','Venango','Waterford']

Common words:

common_words = ['ship','municipality']

Scan every item in "my list", strip the common words, and append the stripped strings to the same list, as shown in "expected result".

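I can find the items that contain a common word, but I am not sure how to replace the word with an empty string and append the result to my list.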

My code so far:

for item in city:
    if(any(x in s.lower() for s in item.split(' ') for x in common_words)) :

8 answers:

Answer 0 (score: 8)

I wrote a small piece of code that works as expected:

city=['Venango Municiplaity', 'Waterford ship','New York']
comwo = ['ship','municipality']
for i, c in enumerate(city):
    for ii in comwo:
        if ii in c:
            city.append(city[i].replace(ii,""))
print(city)

Output:

['Venango Municiplaity', 'Waterford ship', 'New York', 'Waterford ']

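Note:

The list you created contains a misspelling.
Compare the first element of city, Venango Municiplaity, with the second element of common_words, municipality. Because the two spellings differ, the test `ii in c` never matches for that item, so nothing is appended for it.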

Edit:

If you also want to remove the space in front of the word (when there is one), here is a separate version:

city=['Village home', 'Villagehome','New York']
comwo = ['home']
for i, c in enumerate(city):
    for ii in comwo:
        if ii in c:
            # Prefer removing " word" (with its leading space); fall back to the bare word.
            stripped = c.replace(" " + ii, "")
            city.append(stripped if stripped != c else c.replace(ii, ""))
print(city)

Output:

['Village home', 'Villagehome', 'New York', 'Village', 'Village']

Answer 1 (score: 7)

I suggest the following solution, which uses re.sub with flags=re.IGNORECASE to remove the common words regardless of case:

import re

city = ['Venango Municipality', 'Waterford ship','New York']
common_words = ['ship','municipality']

toAppend = []

for c in city:
    for cw in common_words:
        if cw.lower() in c.lower().split():
            toAppend.append(re.sub(cw, "", c, flags=re.IGNORECASE).strip())

city += toAppend

print(city) # ['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Here is a one-liner version using a list comprehension. It is shorter but less readable:

import re

city = ['Venango Municipality', 'Waterford ship','New York']
common_words = ['ship','municipality']

city += [re.sub(cw, "", c, flags=re.IGNORECASE).strip() for c in city for cw in common_words if cw.lower() in c.lower().split()]

print(city) # ['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']
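One caveat with passing each common word straight to re.sub is that the pattern also matches inside longer words and treats any regex metacharacters literally present in the word as syntax. The following is a minimal sketch, not part of the original answer, that assumes whole-word removal is what is wanted; the helper name strip_common_words is hypothetical.

```python
import re

# Hypothetical helper (not from the original answer): remove whole-word,
# case-insensitive occurrences of any common word from a name.
def strip_common_words(name, common_words):
    # re.escape protects against regex metacharacters in the words;
    # \b limits matches to whole words, so "Shipley" is left alone.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in common_words) + r")\b"
    cleaned = re.sub(pattern, "", name, flags=re.IGNORECASE)
    # Collapse any whitespace left behind by the removal.
    return " ".join(cleaned.split())

city = ['Venango Municipality', 'Waterford ship', 'New York']
common_words = ['ship', 'municipality']

extra = []
for c in city:
    stripped = strip_common_words(c, common_words)
    # Only keep names that actually changed and are not empty.
    if stripped and stripped != c:
        extra.append(stripped)

city += extra
print(city)
```

With the sample data this appends 'Venango' and 'Waterford', while a name such as 'Shipley ship' would be reduced to 'Shipley' rather than 'ley'.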

Answer 2 (score: 6)

You can try this: build a new list that holds the items to add, then concatenate it with the original list:

In [1]: city=['Venango Municiplaity', 'Waterford ship','New York']

In [2]: common_words = ['ship', 'municiplaity']

In [3]: list_add = []

In [4]: for item in city:
   ...:     item_words = [s.lower() for s in item.split(' ')]
   ...:     if set(common_words) & set(item_words):
   ...:         new_item = [s for s in item.split(' ') if s.lower() not in common_words]
   ...:         list_add.append(" ".join(new_item))
   ...:         

In [5]: city + list_add
Out[5]: ['Venango Municiplaity', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Answer 3 (score: 4)

Here is one approach using a regular expression.

Demo:

import re

city=['Venango Municiplaity', 'Waterford ship','New York']
common_words = ['ship','municiplaity']
common_words = "(" + "|".join(common_words) + ")"

res = []
for i in city:
    if re.search(common_words, i, flags=re.IGNORECASE):
        res.append(i.strip().split()[0])
print(city + res)

Output:

['Venango Municiplaity', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Answer 4 (score: 4)

You can use a list comprehension to detect whether an item contains something that should be added to the city list.

city=['Venango Municipality', 'Waterford ship','New York']

common_words = ['ship','municipality']
items_to_add = []
for item in city: 
  toAddition = [word for word in item.split() if word.lower() not in common_words]
  if ' '.join(toAddition) != item:
    items_to_add.append(' '.join(toAddition))

print(city + items_to_add)  

Output:

['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']

Answer 5 (score: 4)

Collect the results in a separate list, then use list.extend() to append its contents to the original list:

cities = ['Venango Municipality', 'Waterford ship', 'New York']

common_words = ['ship', 'municipality']

add_list = []

for city in cities:
    rl = []
    triggered = False
    for city_word in city.split():
        if city_word.lower() in common_words:
            triggered = True
        else:
            rl.append(city_word)
    if triggered:
        add_list.append(' '.join(rl))

cities.extend(add_list)
print(cities)

Answer 6 (score: 0)

An approach with the re module (note that the import is not actually used in the code below):

import re

city=['Venango Municipality', 'Waterford ship','New York']
common_words = ['ship','municipality']
print(city)

for item in city:
    word_list = str(item).split(" ")
    for word in word_list:
        if word.lower() in common_words:
            word_list.remove(word)
            city.extend(word_list)
            continue

print(city)

Output:

['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']
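The loop above extends city while iterating over it and removes from word_list while iterating over that too. It happens to work for the sample data, but a name with more than one remaining word (for example 'Waterford ship Harbor') would be appended as separate items. A minimal sketch of a variant that avoids both mutations is shown here; the function name append_stripped is hypothetical and not from the original answer.

```python
# Hypothetical helper (not from the original answer): return a new list with
# the stripped names appended, without mutating anything during iteration.
def append_stripped(names, common_words):
    common = {w.lower() for w in common_words}
    additions = []
    for name in names:
        words = name.split()
        kept = [w for w in words if w.lower() not in common]
        # Append only when a common word was removed and something is left.
        if kept and len(kept) != len(words):
            additions.append(" ".join(kept))
    return names + additions

city = ['Venango Municipality', 'Waterford ship', 'New York']
common_words = ['ship', 'municipality']
result = append_stripped(city, common_words)
print(result)
```

Because the additions are collected first and concatenated afterwards, the original list is left untouched and multi-word remainders stay together as one item.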

Answer 7 (score: 0)

Try using extend:

city.extend([i.split()[0] for i in city if i.split()[1].lower() in map(str.lower,common_words)])

Demo:

>>> city=['Venango Municipality', 'Waterford ship','New York']
>>> common_words = ['ship','municipality']
>>> city.extend([i.split()[0] for i in city if i.split()[1].lower() in map(str.lower,common_words)])
>>> city
['Venango Municipality', 'Waterford ship', 'New York', 'Venango', 'Waterford']
>>> 

If you want to tolerate misspellings:

>>> city=['Venango Municiplaity', 'Waterford ship','New York']
>>> common_words = ['ship','municipality']
>>> from difflib import SequenceMatcher
>>> city.extend([i.split()[0] for i in city if any(SequenceMatcher(None,i.split()[1].lower(),v).ratio()>0.8 for v in map(str.lower,common_words))])
>>> city
['Venango Municiplaity', 'Waterford ship', 'New York', 'Venango', 'Waterford']
>>>
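The expressions above index i.split()[1], so a single-word name would raise IndexError, and they always keep only the first word. As a rough sketch under those observations, the same fuzzy idea can be applied word by word with difflib.get_close_matches; the helper name strip_fuzzy and the 0.8 cutoff are assumptions carried over from the threshold used above, not part of the original answer.

```python
from difflib import get_close_matches

# Hypothetical helper (not from the original answer): drop every word that is
# a close match (similarity >= cutoff) to one of the common words.
def strip_fuzzy(name, common_words, cutoff=0.8):
    lowered = [w.lower() for w in common_words]
    kept = [
        w for w in name.split()
        if not get_close_matches(w.lower(), lowered, n=1, cutoff=cutoff)
    ]
    return " ".join(kept)

city = ['Venango Municiplaity', 'Waterford ship', 'New York', 'Erie']
common_words = ['ship', 'municipality']

# Build the additions first so the list is not modified while it is read.
extra = []
for c in city:
    stripped = strip_fuzzy(c, common_words)
    if stripped and stripped != c:
        extra.append(stripped)

city.extend(extra)
print(city)
```

The single-word entry 'Erie' is included only to show that names without a second word pass through without an error.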