Identify repeated string patterns among the elements of a list and create n new lists, one per unique repeated group - python

Time: 2018-10-08 21:27:35

Tags: python-2.7 list duplicates

I have a list like this:

[review_v001,
review_v002,
review_v003,
layerpack_review_v004,
layerpack_review_v001,
x_v001,
x_v002,
x_v003]

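I need to regroup them into new lists, grouped by the characters before the final underscore and version number (i.e. s[:-5]), so that the result looks like this: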

[review_v001,
review_v002,
review_v003]

[layerpack_review_v004,
layerpack_review_v001]

[x_v001,
x_v002,
x_v003]

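So I need to iterate over the given list, determine which elements share the same prefix from the start of the string up to the version number (e.g. _v001), and then collect them into new lists grouped by that shared prefix.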

Here is one of my attempts. It identifies the duplicates and almost groups them, but the regrouped lists do not contain the correct names.

fullstringlst = [
    'review_v001',
    'review_v002',
    'review_v003',
    'layerpack_review_v004',
    'layerpack_review_v001',
    'x_v001',
    'x_v002',
    'x_v003']

prefixList = []
for s in fullstringlst:
    p = s[:-5]
    prefixList.append(p)
    sublists = []
    for item in set(prefixList):
        sublists.append([p] * prefixList.count(item))
    print sublists

1 Answer:

Answer 0 (score: 0)

You can try the following:

fullstringlst = ['review_v001', 'review_v002', 'review_v003', 'layerpack_review_v004', 'layerpack_review_v001', 'x_v001', 'x_v002', 'x_v003']

for s1 in fullstringlst:
    similar_strs = []
    for s2 in fullstringlst:
        if s1[:-5] == s2[:-5]:
            similar_strs.append(s2)
    print(similar_strs)
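
Note that the loop above prints one list per element, so each group is printed once for every member it contains. A minimal sketch of a variant that builds each group only once is shown here. It assumes, as `[:-5]` does, that the version suffix is always exactly five characters long. The names `group_by_prefix` and `suffix_len` are hypothetical and not part of the original answer.

```python
from collections import OrderedDict


def group_by_prefix(names, suffix_len=5):
    """Group names by everything before the trailing version suffix.

    Assumption: the suffix (e.g. '_v001') always has suffix_len characters.
    """
    # OrderedDict keeps prefixes in first-seen order, also on Python 2.7.
    groups = OrderedDict()
    for name in names:
        prefix = name[:-suffix_len]
        # Create the group's list on first sight, then append to it.
        groups.setdefault(prefix, []).append(name)
    return groups


fullstringlst = ['review_v001', 'review_v002', 'review_v003',
                 'layerpack_review_v004', 'layerpack_review_v001',
                 'x_v001', 'x_v002', 'x_v003']

groups = group_by_prefix(fullstringlst)
for prefix, members in groups.items():
    print(members)
```

This makes a single pass over the list instead of comparing every pair of elements, and `list(groups.values())` gives the three sublists directly. If version suffixes can vary in length, slicing by a fixed width would no longer be enough and the prefix would have to be extracted some other way.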