How do I return the duplicates in a 2D list?

Time: 2019-09-30 15:05:49

Tags: python python-3.x list multidimensional-array

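I have a multidimensional (nested 2D) list. Basically, I want to create one variable that holds the duplicated items in each row, and another variable that holds the items in each row that are not duplicated. Can this be done with a list comprehension?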

My input list is df, and the result I want for the duplicates is Dup:

df = [[[2, 3, 3, 3, 7, 8, 9, 9],[3, 3, 3, 5, 9, 9, 10, 11],[3, 3, 3, 4, 9, 9, 13, 15]], [[2, 3, 3, 3, 4, 4, 5, 6],[4, 4, 5, 7, 7, 7, 8, 10],[4, 4, 6, 7, 7, 7, 9, 11],[3, 3, 3, 4, 4, 8, 11, 12]], [[4, 6, 7, 7, 7, 9, 11, 11],[3, 3, 3, 5, 9, 10, 11, 11],[3, 3, 3, 6, 7, 7, 7, 10, 12, 12]]]

Dup = [[[3, 9],[3, 9],[3, 9]],[[3, 4],[4, 7],[4, 7],[3, 4]],[[7, 11],[3, 11],[3, 7, 12]]]

3 answers:

Answer 0 (score: 3)

This can be done with list comprehensions and collections.Counter, as follows:

from collections import Counter

dup = [[[i for i, c in Counter(sl).items() if c > 1] for sl in l] for l in df]
not_in = [[[i for i, c in Counter(sl).items() if c == 1] for sl in l] for l in df]

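FYI, I used l and sl for the list and the sublist, respectively. i stands for the item, and c is the count of that item in sl. The results are as follows: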

# duplicates
[[[3, 9], [3, 9], [3, 9]], [[3, 4], [4, 7], [4, 7], [3, 4]], [[7, 11], [3, 11], [3, 7, 12]]]
# uniques
[[[2, 7, 8], [5, 10, 11], [4, 13, 15]], [[2, 5, 6], [5, 8, 10], [6, 9, 11], [8, 11, 12]], [[4, 6, 9], [5, 9, 10], [6, 10]]]
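For reference, here is a minimal self-contained sketch of the same Counter idea. It builds one Counter per row and reads both results from it. The helper name split_by_count and the shortened sample df are assumptions made for illustration; they are not part of the answer above.

```python
from collections import Counter

def split_by_count(row):
    """Return (duplicates, uniques) for one innermost list, in first-seen order."""
    counts = Counter(row)  # tally every item in a single pass over the row
    dups = [item for item, c in counts.items() if c > 1]
    uniques = [item for item, c in counts.items() if c == 1]
    return dups, uniques

# Shortened sample in the same nested shape as the question's df (illustrative only).
df = [
    [[2, 3, 3, 3, 7, 8, 9, 9], [3, 3, 3, 5, 9, 9, 10, 11]],
    [[4, 6, 7, 7, 7, 9, 11, 11]],
]

# Compute each row's pair once, then separate the two halves.
pairs = [[split_by_count(row) for row in group] for group in df]
dup = [[d for d, _ in group] for group in pairs]
not_in = [[u for _, u in group] for group in pairs]

print(dup)     # [[[3, 9], [3, 9]], [[7, 11]]]
print(not_in)  # [[[2, 7, 8], [5, 10, 11]], [[4, 6, 9]]]
```

Because Counter is a dict subclass, items come out in the order they were first seen in each row (Python 3.7+).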

Answer 1 (score: 2)

No extra imports are needed; just use a doubly nested list comprehension with set and count:

>>> [[[x for x in set(ll) if ll.count(x) > 1] for ll in l] for l in df]
[[[3, 9], [3, 9], [3, 9]],
 [[3, 4], [4, 7], [4, 7], [3, 4]],
 [[7, 11], [3, 11], [3, 7, 12]]]

>>> [[[x for x in set(ll) if ll.count(x) == 1] for ll in l] for l in df]
[[[2, 7, 8], [5, 10, 11], [4, 13, 15]],
 [[2, 5, 6], [5, 8, 10], [6, 9, 11], [8, 11, 12]],
 [[4, 6, 9], [5, 9, 10], [6, 10]]]

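Note, however, that if the innermost lists are large, using Counter will likely be faster, because list.count rescans the whole list for every distinct value. Otherwise it makes little difference, and this version is probably the simplest to understand.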
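The performance remark can be checked with a rough sketch like the one below. The function names and the synthetic row are assumptions made for illustration, and the timings depend on the machine, so no expected numbers are given. sorted() is used only to make the two results comparable, since set iteration order is not guaranteed.

```python
from collections import Counter
import timeit

def dups_with_count(row):
    # list.count rescans the whole row for each distinct value: roughly O(n * k).
    return sorted(x for x in set(row) if row.count(x) > 1)

def dups_with_counter(row):
    # Counter tallies every value in one pass over the row: roughly O(n).
    return sorted(x for x, c in Counter(row).items() if c > 1)

# Hypothetical large row: 2000 items, each value 0..999 appearing twice.
row = list(range(1000)) * 2

# Both approaches should agree on which values are duplicated.
assert dups_with_count(row) == dups_with_counter(row)

for fn in (dups_with_count, dups_with_counter):
    seconds = timeit.timeit(lambda: fn(row), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

For short rows like those in the question, the difference is unlikely to matter.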

Answer 2 (score: 1)

# An item is a duplicate if it also appears elsewhere in the same row; dict.fromkeys drops repeats while keeping order.
Dup = [[list(dict.fromkeys(el for i, el in enumerate(l) if el in l[:i] + l[i+1:])) for l in ll] for ll in df]
Not_in = [[[el for i, el in enumerate(l) if el not in l[:i] + l[i+1:]] for l in ll] for ll in df]