How can I remove strings that are contained in other strings in the same list?

Time: 2019-04-09 20:26:17

Tags: python string list

I have a list of strings and need to remove the items that are contained in other items, as shown:

a = ["one", "one single", "one single trick", "trick", "trick must", "trick must get", "one single trick must", "must get", "must get the job done"]

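I just need to remove every string that is contained in another string of the same list. For example, "one" is contained in "one single", so it must be removed; "one single" is in turn contained in "one single trick", so it needs to be removed as well.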

I tried:

b=a
for item in a:
    for element in b:
        if item in element:
            b.remove(element)

Expected result:

a = ["trick must get", "one single trick must", "must get the job done"]

Any help would be greatly appreciated! Thanks in advance!

3 Answers:

Answer 0: (score: 3)

A list comprehension combined with Python's any function should do this nicely:

a = [phrase for phrase in a if not any([phrase2 != phrase and phrase in phrase2 for phrase2 in a])]

Result:

>>> a = ["one", "one single", "one single trick", "trick", "trick must", "trick must get", "one single trick must", "must get", "must get the job done"]
>>> a = [phrase for phrase in a if not any([phrase2 != phrase and phrase in phrase2 for phrase2 in a])]
>>> a
['trick must get', 'one single trick must', 'must get the job done']
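
As a minimal sketch, the same idea can be wrapped in a small function. The name drop_contained is hypothetical and not part of the answer above; the only other change is that any receives a generator instead of a list, so it can stop at the first match:

```python
def drop_contained(phrases):
    """Return the phrases that are not a substring of any *different* phrase."""
    return [
        p for p in phrases
        # A generator (no inner list) lets any() short-circuit on the first hit.
        if not any(p != other and p in other for other in phrases)
    ]


a = ["one", "one single", "one single trick", "trick", "trick must",
     "trick must get", "one single trick must", "must get",
     "must get the job done"]
result = drop_contained(a)
print(result)
```

Because of the p != other check, exact duplicates do not eliminate each other: a string that occurs twice and is contained in nothing longer is kept twice.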

Answer 1: (score: 2)

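An efficient way to solve this in roughly O(n) time (for short phrases, not counting the sort) is to use a set that keeps track of every sub-phrase of the phrases seen so far, iterating from the longest string to the shortest and adding a string to the output only if it is not already in that set. Note that this compares whole words, so it finds sub-phrases rather than arbitrary character substrings: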

seen = set()
output = []
for s in sorted(a, key=len, reverse=True):
    words = tuple(s.split())
    if words not in seen:
        output.append(s)
    seen.update({words[i: i + n] for i in range(len(words)) for n in range(len(words) - i + 1)})

output becomes:

['one single trick must', 'must get the job done', 'trick must get']
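
The word-level matching matters when one word is a character substring of another. The sketch below wraps the same set-based approach in a hypothetical function, drop_subphrases, and runs it on a made-up sample list that is not from the question:

```python
def drop_subphrases(phrases):
    """Keep only phrases that are not a contiguous run of words in a longer one."""
    seen = set()
    output = []
    for s in sorted(phrases, key=len, reverse=True):
        words = tuple(s.split())
        if words not in seen:
            output.append(s)
        # Record every contiguous, non-empty run of words in this phrase.
        seen.update(
            words[i:j]
            for i in range(len(words))
            for j in range(i + 1, len(words) + 1)
        )
    return output


# "one" is a character substring of "someone" but not a word of that phrase.
sample = ["one", "someone else", "else"]
kept = drop_subphrases(sample)
print(kept)  # ['someone else', 'one']
```

A plain substring test (item in other) would also drop "one" here, so which approach fits depends on whether the list holds phrases or arbitrary strings.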

Answer 2: (score: 1)

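This is not an efficient solution, but by sorting from longest to shortest and popping the last (shortest) element each time, we can check whether that element appears as a substring anywhere in the remaining, longer strings.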

a = ['one', 'one single', 'one single trick', 'trick', 'trick must', 'trick must get', 
     'one single trick must', 'must get', 'must get the job done']
a = sorted(a, key=len, reverse=True)
b = []
for i in range(len(a)):
    x = a.pop()
    if x not in "\t".join(a):
        b.append(x)

# ['trick must get', 'must get the job done', 'one single trick must']
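
The tab separator in the join is doing real work. A small sketch with made-up items (not from the question), assuming no item contains a tab itself, shows that joining with a space can produce a false match spanning two neighbouring items:

```python
items = ["a b", "c d", "b c"]
candidate = "b c"
others = [s for s in items if s != candidate]

# Joined with a space, the end of "a b" and the start of "c d" form "b c".
match_with_space = candidate in " ".join(others)
# Joined with a tab, the boundary becomes "b\tc", so nothing spans two items.
match_with_tab = candidate in "\t".join(others)

print(match_with_space)  # True
print(match_with_tab)    # False
```

Any separator works as long as it cannot occur inside the strings being compared.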