python: filter() an iterable, counting filtered and unfiltered items

Time: 2018-08-15 02:32:49

Tags: python python-3.x dictionary filter iterator

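I have a large iterable.
I want to filter it with the filter() function.
How can I count (in some elegant way) how many items were filtered out?
(The same question could apply to map(), reduce(), etc.)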

Sure, I can do this:

items = get_big_iterable()
count_good = 0
count_all = 0
for item in items:
    if should_keep(item):
        count_good += 1
    count_all += 1

print('keep: {} of {}'.format(count_good, count_all))

Is this possible using filter()?

items = filter(should_keep, get_big_iterable())
for item in items:
    # ... use the values here ...
    # is it possible to count the filtered-out items here too?
    pass

I don't want to iterate twice, and I would like to use filter() or a similar solution.

3 answers:

Answer 0: (score: 1)

This should be fairly simple using enumerate and some basic arithmetic:

def should_keep(x):
    return x % 3 == 0

items = range(1, 28)


def _wrapper(x):
    return should_keep(x[1])

filtered_with_counts = enumerate(filter(_wrapper, enumerate(items, 1)), 1)

for i, (j, item) in filtered_with_counts:
    # do something with item
    print(f"Item is {item}, total: {j}, good: {i}, bad: {j-i}")

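# Caveat: i and j are only defined if at least one item passed the filter,
# and j is the position of the last kept item, so rejected items after it
# are not included in count_all.
count_all = j
count_good = i
count_bad = count_all - count_good
print(f"Final: {count_all}, {count_good}, {count_bad}")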

Output:

Item is 3, total: 3, good: 1, bad: 2
Item is 6, total: 6, good: 2, bad: 4
Item is 9, total: 9, good: 3, bad: 6
Item is 12, total: 12, good: 4, bad: 8
Item is 15, total: 15, good: 5, bad: 10
Item is 18, total: 18, good: 6, bad: 12
Item is 21, total: 21, good: 7, bad: 14
Item is 24, total: 24, good: 8, bad: 16
Item is 27, total: 27, good: 9, bad: 18
Final: 27, 9, 18

I probably wouldn't use this, though. Note that I assumed you might not want to modify should_keep, but you can always wrap it.
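To illustrate the "wrap it" remark, here is a minimal sketch of a counting wrapper around the predicate, used with the built-in filter(). The class name CountingPredicate and its attributes kept and dropped are hypothetical names chosen for this example, not part of any library.

```python
class CountingPredicate:
    """Wrap a predicate and count how many items it accepted and rejected.

    Hypothetical helper for illustration only.
    """

    def __init__(self, predicate):
        self.predicate = predicate
        self.kept = 0
        self.dropped = 0

    def __call__(self, item):
        result = bool(self.predicate(item))
        if result:
            self.kept += 1
        else:
            self.dropped += 1
        return result


def should_keep(x):
    return x % 3 == 0


counting = CountingPredicate(should_keep)

# The counters only advance as the filter object is consumed.
kept_items = list(filter(counting, range(1, 29)))

print(f"keep: {counting.kept} of {counting.kept + counting.dropped}")
```

Because the wrapper sees every item that filter() tests, rejected items after the last kept one (28 here) are counted as well.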

Answer 1: (score: 1)

I can think of two approaches. The first is short, but it is probably bad for performance and defeats the purpose of having an iterator:

count = len(list(your_filtered_iterable))

The other approach is to write your own filter. According to the Python documentation:

  

Note that filter(function, iterable) is equivalent to the generator expression (item for item in iterable if function(item)) if function is not None and (item for item in iterable if item) if function is None.
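As a quick check of that equivalence, the following sketch compares filter() with the corresponding generator expressions on a small sample. The list data and the predicate is_positive_int are made up for this example.

```python
data = [0, 1, 2, "", "a", None, 3]


def is_positive_int(x):
    # Hypothetical predicate used only for this comparison.
    return isinstance(x, int) and x > 0


# Case 1: an explicit predicate function.
via_filter = list(filter(is_positive_int, data))
via_genexp = list(item for item in data if is_positive_int(item))

# Case 2: function is None, so items are kept when they are truthy.
truthy_via_filter = list(filter(None, data))
truthy_via_genexp = list(item for item in data if item)

print(via_filter == via_genexp)
print(truthy_via_filter == truthy_via_genexp)
```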

So you could write something like this:

class Filter:
    def __init__(self, func, iterable):
        self.count_good = 0
        self.count_all = 0
        self.func = func
        self.iterable = iterable

    def __iter__(self):
        if self.func is None:
            for obj in self.iterable:
                if obj:
                    self.count_good += 1
                    self.count_all += 1
                    yield obj
                else:
                    self.count_all += 1
        else:
            for obj in self.iterable:
                if self.func(obj):
                    self.count_good += 1
                    self.count_all += 1
                    yield obj
                else:
                    self.count_all += 1

You can then access count_good and count_all from the Filter instance:

items = Filter(should_keep, get_big_iterable())
for item in items:
    # do whatever you need with item
    print('keep: {} of {}'.format(items.count_good, items.count_all))
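The same idea can also be sketched more compactly as a generator function that updates a small stats object, treating a None function like bool. The names FilterStats and counting_filter are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    """Running counters for one filtering pass (hypothetical helper)."""
    count_good: int = 0
    count_all: int = 0


def counting_filter(func, iterable, stats):
    """Yield the items accepted by func, updating stats along the way."""
    # Like the built-in filter(), a func of None means "keep truthy items".
    predicate = bool if func is None else func
    for obj in iterable:
        stats.count_all += 1
        if predicate(obj):
            stats.count_good += 1
            yield obj


stats = FilterStats()
kept = list(counting_filter(lambda x: x % 2 == 0, range(10), stats))
print('keep: {} of {}'.format(stats.count_good, stats.count_all))
```

The counters are only final once the generator has been fully consumed. While the loop is still running they hold the running totals.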

Answer 2: (score: 1)

The built-in filter does not provide this. You need to write your own filter class that implements the __next__ and __iter__ methods.

Code

class FilterCount:
    def __init__(self, function, iterable):
        self.function = function
        self.iterable = iter(iterable)
        self.countTrue, self.countFalse = 0, 0

    def __iter__(self):
        return self

    def __next__(self):
        nxt = next(self.iterable)
        while not self.function(nxt):
            self.countFalse += 1
            nxt = next(self.iterable)

        self.countTrue += 1
        return nxt

Example

lst = ['foo', 'foo', 'bar']
filtered_lst = FilterCount(lambda x: x == 'foo', lst)

for x in filtered_lst:
    print(x)
print(filtered_lst.countTrue)
print(filtered_lst.countFalse)

Output

foo
foo
2
1
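One property of an iterator-style counter like this is that the counts only reflect what has been consumed so far. The sketch below uses a compact restatement of the same idea under the hypothetical name LazyFilterCount, together with itertools.islice, to show the counters after a partial pass and after the full pass.

```python
from itertools import islice


class LazyFilterCount:
    """Compact restatement of an iterator-protocol counting filter.

    Hypothetical name, for illustration only.
    """

    def __init__(self, function, iterable):
        self.function = function
        self.iterable = iter(iterable)
        self.count_true = 0
        self.count_false = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Pull items until one passes; the for loop ends when the source is exhausted.
        for item in self.iterable:
            if self.function(item):
                self.count_true += 1
                return item
            self.count_false += 1
        raise StopIteration


evens = LazyFilterCount(lambda x: x % 2 == 0, range(1, 11))

first_two = list(islice(evens, 2))               # consume only two kept items
partial = (evens.count_true, evens.count_false)  # counts so far

rest = list(evens)                               # consume the remainder
final = (evens.count_true, evens.count_false)    # counts for the whole input

print(first_two, partial)
print(rest, final)
```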