Question

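I have a long nested list. Each sublist contains 2 elements. I want to iterate over the whole list and drop any sublist whose first element has already appeared 3 times.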

Example:

ls = [[1,1], [1,2], [1,3], [1,4], [2,2], [2,3], [3,4], [3,5], [3,6], [3,7]]

desired_result = [[1,1], [1,2], [1,3], [2,2], [2,3], [3,4], [3,5], [3,6]]

Answer 1

If the input is sorted by the first element, you can use groupby and islice:

from itertools import groupby, islice
from operator import itemgetter

ls = [[1, 1], [1, 2], [1, 3], [1, 4], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6], [3, 7]]

result = [e for _, group in groupby(ls, key=itemgetter(0)) for e in islice(group, 3)]
print(result)

Output

[[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

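The idea is to use groupby to group the elements by their first value, then use islice to take the first three elements of each group (or fewer, if the group is shorter).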
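Note that groupby only groups consecutive items, which is why the input needs to be sorted (or at least clustered) by first element. For unsorted input, one possible workaround is to sort first. The sketch below is not part of the original answer; `keep_first_n` and `unsorted_ls` are hypothetical names, and the result comes out ordered by first element rather than in the original order.

```python
from itertools import groupby, islice
from operator import itemgetter


def keep_first_n(pairs, n=3):
    """Keep at most n sublists per first element (hypothetical helper).

    sorted() is stable, so sublists sharing a first element keep their
    original relative order; the overall result is ordered by first element.
    """
    first = itemgetter(0)
    ordered = sorted(pairs, key=first)
    return [e for _, group in groupby(ordered, key=first) for e in islice(group, n)]


# Made-up unsorted input, purely for illustration.
unsorted_ls = [[3, 4], [1, 1], [1, 2], [3, 5], [1, 3], [1, 4], [3, 6], [3, 7]]
kept = keep_first_n(unsorted_ls)
print(kept)  # [[1, 1], [1, 2], [1, 3], [3, 4], [3, 5], [3, 6]]
```

Sorting costs O(n log n); if the original order must be preserved, a counting approach is likely the better fit.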

Answer 2

Probably not the shortest answer.

The idea is to count occurrences while iterating over ls.

from collections import defaultdict

filtered_ls = []
counter = defaultdict(int)
for l in ls:
    counter[l[0]] += 1
    if counter[l[0]] > 3:
        continue
    filtered_ls.append(l)
print(filtered_ls)
# [[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

Answer 3

If the list is already sorted, you can use itertools.groupby and then keep only the first three items of each group:

>>> import itertools
>>> ls = [[1,1], [1,2], [1,3], [1,4], [2,2], [2,3], [3,4], [3,5], [3,6], [3,7]]
>>> list(itertools.chain.from_iterable(list(g)[:3] for _,g in itertools.groupby(ls, key=lambda i: i[0])))
[[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

Answer 4

You can do it as follows:

ls = [[1,1], [1,2], [1,3], [1,4], [2,2], [2,3], [3,4], [3,5], [3,6], [3,7]]

val_count = dict.fromkeys(set([i[0] for i in ls]), 0)

new_ls = []
for i in ls:
    if val_count[i[0]] < 3:
        val_count[i[0]] += 1 
        new_ls.append(i)

print(new_ls)

Output:

[[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

Answer 5

You can use collections.defaultdict to aggregate by the first value in O(n) time, then use itertools.chain to build the list of lists.

from collections import defaultdict
from itertools import chain

dd = defaultdict(list)
for key, val in ls:
    if len(dd[key]) < 3:
        dd[key].append([key, val])

res = list(chain.from_iterable(dd.values()))

print(res)

# [[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]
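One thing to keep in mind with this approach: because the sublists are collected per key and then chained, the output is grouped by first element in first-seen order (dict insertion order, guaranteed since Python 3.7). For sorted input that matches the original order; for unsorted input it does not. A small illustration with made-up data (`mixed` and `grouped` are hypothetical names):

```python
from collections import defaultdict
from itertools import chain

# Made-up unsorted input, purely for illustration.
mixed = [[2, 1], [1, 1], [2, 2], [1, 2]]

dd = defaultdict(list)
for key, val in mixed:
    # Keep at most three sublists per first element.
    if len(dd[key]) < 3:
        dd[key].append([key, val])

grouped = list(chain.from_iterable(dd.values()))
print(grouped)  # [[2, 1], [2, 2], [1, 1], [1, 2]]
```

The sublists with first element 2 come out together, ahead of those with first element 1, even though they were interleaved in the input.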

Answer 6

Ghillas Belhadj's answer is good, but you should consider defaultdict for this task. The idea comes from Raymond Hettinger, who suggested using defaultdict for grouping and counting tasks.

from collections import defaultdict

def remove_sub_lists(a_list, nth_occurence):
    # Count how often each first element has been seen so far.
    found = defaultdict(int)
    for sublist in a_list:
        first_index = sublist[0]
        found[first_index] += 1
        # Yield the sublist only while its first element is within the limit.
        if found[first_index] <= nth_occurence:
            yield sublist

max_3_times_first_index = list(remove_sub_lists(ls, 3))

Answer 7

Here is an option that does not use any modules:

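countDict = {}
result = []

# Build a new list instead of calling ls.remove() inside the loop:
# removing items from a list while iterating over it skips elements.
for i in ls:
    countDict[i[0]] = countDict.get(i[0], 0) + 1
    if countDict[i[0]] <= 3:
        result.append(i)

print(result)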
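If the original list object itself has to be modified rather than replaced by a new list, slice assignment is one way to do that safely, since removing items from a list inside a loop over that same list can skip elements. A minimal sketch, assuming a hypothetical helper name `trim_in_place` and made-up sample data:

```python
def trim_in_place(pairs, n=3):
    """Trim pairs in place, keeping at most n sublists per first element."""
    counts = {}
    kept = []
    for sub in pairs:
        counts[sub[0]] = counts.get(sub[0], 0) + 1
        if counts[sub[0]] <= n:
            kept.append(sub)
    pairs[:] = kept  # slice assignment mutates the original list object


# Made-up data with two consecutive surplus items, the case where
# removing during iteration would skip an element.
data = [[1, 1], [1, 2], [1, 3], [1, 4], [1, 5], [2, 2]]
alias = data
trim_in_place(data)
print(data)  # [[1, 1], [1, 2], [1, 3], [2, 2]]
```

Because `alias` refers to the same list object, it sees the trimmed contents as well.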

Remove sublists after the first element has appeared n times
