Python: remove the strings in a list from a string

Asked: 2014-11-15 09:50:59

Tags: python

I have a large string and a large list of stop words. I created a small example below.

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

As you can imagine, I want the members of stop removed from s. I tried this.

for word in stop:
    s = s.replace(word,"")

I get this error.

AttributeError: 'list' object has no attribute 'replace'

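4 answers:

Answer 0: (score: 0)

You need to do the following. Split s into a list of words. Then build a hash (a dict) from the stop word list. Then iterate over the word list, and keep every word that is not in the hash.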

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
arr = s.split(' ')
h = {i: 1 for i in stop}

result = []
for i in arr:
    if i not in h:
        result.append(i)

# Tokens with attached punctuation, such as "old.", do not match "old" and are kept
print(' '.join(result))
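One thing the split-and-filter approach above does not handle is punctuation stuck to a word, so "old." is not recognised as "old". A minimal sketch of one way around that, assuming it is acceptable to compare each token with its leading and trailing punctuation stripped (the function name remove_stop_words is made up for this example, and a set is used in place of the dict):

```python
import string

def remove_stop_words(text, stop_words):
    """Drop tokens whose punctuation-stripped form is a stop word."""
    stop_set = set(stop_words)  # set membership tests are O(1) on average
    kept = []
    for token in text.split():
        # Compare without leading/trailing punctuation, so "old." matches "old"
        if token.strip(string.punctuation) not in stop_set:
            kept.append(token)
    return " ".join(kept)

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
print(remove_stop_words(s, stop))
# I 20 years I live New York United States America.
```

Note that the whole token is dropped, so the period after "old" disappears together with the word. Whether that is what you want depends on the use case.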

Answer 1: (score: 0)

When you call s.replace(), s is a list, so you probably modified s somewhere and it is now a list rather than a string.

This code works fine:

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
for word in stop:
    s = s.replace(word,"")

Try to find where s gets modified: search your code for an assignment to s.
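To illustrate, here is a small sketch that reproduces the error. The s = s.split() line is only a guess at the kind of earlier assignment that could have turned s into a list; the question does not show the actual cause.

```python
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

# Hypothetical earlier step: after this line s is a list, not a str
s = s.split()

error_message = None
try:
    for word in stop:
        s = s.replace(word, "")  # lists have no replace() method
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# A cheap check while debugging: look at the type before calling str methods
print(type(s).__name__)
```

Printing type(s) right before the loop is usually the quickest way to confirm what s has become.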

Answer 2: (score: 0)

Demo here

The most elegant way is to use set difference

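z = list(set(s.split()) - set(stop))
print(z)

This prints something like the following (a set has no defined order, so the order of the elements may differ):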

['United', '20', 'I', 'live', 'years', 'States', 'America.', 'York', 'New', 'old.']
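Two side effects of the set difference are worth seeing before relying on it: duplicate words collapse into one, and the original word order is lost. A short sketch that makes both visible, using the s and stop from the question (the names words and remaining are just for this example):

```python
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

words = s.split()
remaining = set(words) - set(stop)

# "I" occurs twice in s but only once in the set,
# and "old." survives because the period makes it differ from "old"
print(len(words), len(remaining))
# 15 10

# sorted() gives a stable order for display; it is not the sentence order
print(sorted(remaining))
# ['20', 'America.', 'I', 'New', 'States', 'United', 'York', 'live', 'old.', 'years']
```

So this works if you only need the set of remaining words, but not if you want to rebuild the sentence.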

Unit test

import unittest

def so_26944574(string):
    stop = ["am", "old", "in", "of"]
    z = list(set(string.split()) - set(stop))
    return sorted(z)

# Unit Test
class Test(unittest.TestCase):
    def testcase(self):
        self.assertEqual(so_26944574("I am 20 years old. I live in New York in United States of America."), sorted(['United', '20', 'I', 'live', 'years', 'States', 'America.', 'York', 'New', 'old.']))
        self.assertEqual(so_26944574("I am very old but still strong, kind of"), sorted(['I', 'very', 'but', 'still', 'strong,', 'kind']))

if __name__ == "__main__":
    unittest.main()

The test passes

Ran 1 test in 0.000s

OK

Answer 3: (score: 0)

Another way to do it:

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
s_list = s.split() # turn string into list
s = ' '.join([word for word in s_list if word not in stop]) # Make new string
>>> s
'I 20 years old. I live New York United States America.'
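As the output shows, "old." is still there because it is compared as a whole token. If whole-word matching regardless of punctuation is what you are after, a regular expression with word boundaries is one possible alternative. This is only a sketch of a different technique, not part of the answer above; the function name strip_stop_words is made up for the example.

```python
import re

def strip_stop_words(text, stop_words):
    """Remove whole-word occurrences of stop_words from text."""
    # \b anchors the match at word boundaries, so "in" is removed as a word
    # but not inside a longer word such as "Finland".
    # The leading \s* also removes the whitespace in front of the word.
    alternatives = "|".join(re.escape(word) for word in stop_words)
    pattern = re.compile(r"\s*\b(?:" + alternatives + r")\b")
    # strip() cleans up the space left over when the text starts with a stop word
    return pattern.sub("", text).strip()

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
print(strip_stop_words(s, stop))
# I 20 years. I live New York United States America.
```

Unlike str.replace, this does not touch stop words that appear inside other words, and the punctuation after a removed word is kept.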