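I really like Python generators. In particular, I find that they are just the right tool for connecting to REST endpoints: my client code only has to iterate over the generator that is connected to the endpoint. However, I am finding that Python's generators are not as expressive as I would like. Typically, I need to filter the data I get from the endpoint. In my current code, I pass a predicate function to the generator, which applies the predicate to the data it is handling and only yields the data if the predicate is True.
I would like to move toward composition of generators, like data_filter(datasource()). Here is some demonstration code that shows what I have tried. It is pretty clear why it does not work; what I am trying to figure out is the most expressive way of arriving at a solution: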
# Mock of REST endpoint: in the actual code, the generator is
# connected to a REST endpoint which returns a dictionary (from JSON).
def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

# Mock of a filter: simplification, in reality I am filtering on some
# aspect of the data, like data['type'] == "external"
def data_filter(d):
    if len(d) < 8:
        yield d

# First Try:
# for w in data_filter(mock_datasource()):
#     print(w)
# >> TypeError: object of type 'generator' has no len()

# Second Try
# for w in (data_filter(d) for d in mock_datasource()):
#     print(w)
# I don't get words out,
# rather <generator object data_filter at 0x101106a40>

# Using a predicate to filter works, but is not the expressive
# composition I am after
for w in (d for d in mock_datasource() if len(d) < 8):
    print(w)
Answer 0 (score: 4)
data_filter should apply len to the elements of d rather than to d itself, like this:
def data_filter(d):
    for x in d:
        if len(x) < 8:
            yield x
Now your code:
for w in data_filter(mock_datasource()):
    print(w)
prints
liberty
seminar
formula
comedy
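For comparison, the question's second attempt can also be made to produce words while keeping the per-item filter unchanged. The sketch below assumes the original per-item generator (renamed item_filter here purely for illustration) and flattens the small generators with itertools.chain.from_iterable; it is one possible approach, not necessarily the most readable one.

```python
from itertools import chain

def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

def item_filter(d):
    # Per-item generator, as in the question: yields d or nothing at all.
    # The name item_filter is hypothetical, chosen to avoid confusion
    # with the stream-level data_filter above.
    if len(d) < 8:
        yield d

# Each item_filter(d) call returns its own tiny generator;
# chain.from_iterable lazily flattens them into one stream of words.
short_words = list(
    chain.from_iterable(item_filter(d) for d in mock_datasource())
)
print(short_words)
```

The second attempt printed generator objects because each call to the per-item filter returns a generator rather than a word; flattening is what turns those back into items.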
Answer 1 (score: 1)
More concisely, you can do this directly with a generator expression:
def length_filter(d, minlen=0, maxlen=8):
    return (x for x in d if minlen <= len(x) < maxlen)
Apply the filter to your generator just like a regular function:
for element in length_filter(endpoint_data()):
    ...
If your predicate is really simple, the built-in function filter may also meet your needs.
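As a rough illustration of the built-in route, here is a minimal sketch that reuses the question's mock data. The predicate name is_short is an assumption made for this example. In Python 3, filter returns a lazy iterator, so nothing is pulled from the data source until the result is iterated.

```python
def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

def is_short(word):
    # Hypothetical predicate: keep words shorter than 8 characters.
    return len(word) < 8

# filter(predicate, iterable) wraps the generator lazily.
short_words = filter(is_short, mock_datasource())
result = list(short_words)
print(result)
```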
Answer 2 (score: 0)
You can pass in a filter function that is applied to each item:
def mock_datasource(filter_function):
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield filter_function(d)

def filter_function(d):
    # filter
    return filtered_data
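Note that a data source written this way yields whatever filter_function returns for every item, so items are transformed rather than dropped. If the goal is to skip items, one possible variation is to treat the argument as a boolean predicate and test it before yielding. The sketch below assumes that reading; the names predicate and is_short are hypothetical.

```python
def mock_datasource(predicate=None):
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        # Yield the item only when no predicate was given or it passes.
        if predicate is None or predicate(d):
            yield d

def is_short(d):
    # Hypothetical predicate: keep words shorter than 8 characters.
    return len(d) < 8

filtered = list(mock_datasource(is_short))
unfiltered = list(mock_datasource())
print(filtered)
```

This is essentially the approach the question already uses and wants to move away from, since the filtering stays coupled to the data source.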
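Answer 3 (score: 0)
What I would do is define filter(data_filter) so that it takes a generator as input and returns a generator of the values that pass the data_filter predicate (a regular predicate that knows nothing about the generator interface).
The code is: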
def filter(pred):
    """Filter, for composition with generators that take coll as an argument."""
    def generator(coll):
        for x in coll:
            if pred(x):
                yield x
    return generator

def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

def data_filter(d):
    if len(d) < 8:
        return True

gen1 = mock_datasource()
filtering = filter(data_filter)
gen2 = filtering(gen1)  # or filter(data_filter)(mock_datasource())
print(list(gen2))
If you want to take it further, you can use compose, which was my whole intention:
from functools import reduce

def compose(*fns):
    """Compose functions left to right - allows generators to compose with same
    order as Clojure style transducers in first argument to transduce."""
    return reduce(lambda f, g: lambda *x, **kw: g(f(*x, **kw)), fns)

gen_factory = compose(mock_datasource,
                      filter(data_filter))
gen = gen_factory()
print(list(gen))
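PS: I used some code found here, where the Clojure folks express composition of generators, inspired by the way they do composition generically with transducers.
PS2: filter can be written in a more Pythonic way: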
def filter(pred):
    """Filter, for composition with generators that take coll as an argument."""
    return lambda coll: (x for x in coll if pred(x))
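The same compose idea should extend to pipelines with more than two stages. The sketch below is an illustration under stated assumptions: keep and transform are hypothetical helper names (keep plays the role of filter above, renamed so the built-in is not shadowed), and the two predicates are made up for the example.

```python
from functools import reduce

def compose(*fns):
    """Compose left to right: compose(f, g)(x) == g(f(x))."""
    return reduce(lambda f, g: lambda *a, **kw: g(f(*a, **kw)), fns)

def keep(pred):
    # Hypothetical stage factory: keeps the items for which pred is true.
    return lambda coll: (x for x in coll if pred(x))

def transform(fn):
    # Hypothetical stage factory: applies fn to every item.
    return lambda coll: (fn(x) for x in coll)

def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

# Source first, then each stage in the order the data flows through it.
pipeline = compose(
    mock_datasource,
    keep(lambda w: len(w) < 8),
    keep(lambda w: "m" in w),
    transform(str.upper),
)
result = list(pipeline())
print(result)
```

Every stage stays lazy, so items are pulled through the whole pipeline one at a time only when the final generator is consumed.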