How can I use generators to form multiple pipelines?

Time: 2017-08-28 17:05:11

Tags: python generator

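I'm using Python and trying to find an elegant way to chain multiple generators together. An example of the problem would be a root generator that provides some kind of data, with each value passed to its "children" like a cascade, and the children may in turn modify the object they receive. I could go this route: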

for x in gen1:
    gen2(x)
    gen3(x)

But it's ugly and inelegant. I'm looking for a more functional way of doing this.

4 Answers:

Answer 0: (score: 3)

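You could turn the generators into coroutines so they can send() values to, and receive values from, one another (using (yield) expressions). This gives each one a chance to alter the values it receives, and/or pass them along to the next generator/coroutine (or ignore them completely).

Note that in the sample code below, I use a decorator named coroutine to "prime" the generator/coroutine functions. This causes them to run up to their first yield expression/statement. It's a slightly modified version of the one shown in the YouTube video of the talk A Curious Course on Coroutines and Concurrency, which Dave Beazley gave at PyCon 2009.

As you should be able to see from the output produced, each data value is processed by every pipeline configured, via a single send() to the head coroutine, which then effectively "multiplexes" it down each pipeline. Since each sub-coroutine can do the same, it would be possible to set up an elaborate "tree" of processes.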

import sys

def coroutine(func):
    """ Decorator to "prime" generators used as coroutines. """
    def start(*args,**kwargs):
        cr = func(*args,**kwargs)  # Call the function to create the generator object.
        next(cr)                   # Advance to just before its first yield.
        return cr
    return start

def pipe(name, value, divisor, coroutines):
    """ Utility function to send values to list of coroutines. """
    print('  {}: {} is divisible by {}'.format(name, value, divisor))
    for cr in coroutines:
        cr.send(value)

def this_func_name():
    """ Helper function that returns name of function calling it. """
    frame = sys._getframe(1)
    return frame.f_code.co_name


@coroutine
def gen1(*coroutines):
    while True:
        value = (yield)     # Receive values sent here via "send()".
        if value % 2 == 0:  # Only pipe even values.
            pipe(this_func_name(), value, 2, coroutines)

@coroutine
def gen2(*coroutines):
    while True:
        value = (yield)     # Receive values sent here via "send()".
        if value % 4 == 0:  # Only pipe values divisible by 4.
            pipe(this_func_name(), value, 4, coroutines)

@coroutine
def gen3(*coroutines):
    while True:
        value = (yield)     # Receive values sent here via "send()".
        if value % 6 == 0:  # Only pipe values divisible by 6.
            pipe(this_func_name(), value, 6, coroutines)

# Create and link together some coroutine pipelines.
g3 = gen3()
g2 = gen2()
g1 = gen1(g2, g3)

# Send values through both pipelines (g1 -> g2, and g1 -> g3) of coroutines.
for value in range(17):
    print('piping {}'.format(value))
    g1.send(value)

Output:

piping 0
  gen1: 0 is divisible by 2
  gen2: 0 is divisible by 4
  gen3: 0 is divisible by 6
piping 1
piping 2
  gen1: 2 is divisible by 2
piping 3
piping 4
  gen1: 4 is divisible by 2
  gen2: 4 is divisible by 4
piping 5
piping 6
  gen1: 6 is divisible by 2
  gen3: 6 is divisible by 6
piping 7
piping 8
  gen1: 8 is divisible by 2
  gen2: 8 is divisible by 4
piping 9
piping 10
  gen1: 10 is divisible by 2
piping 11
piping 12
  gen1: 12 is divisible by 2
  gen2: 12 is divisible by 4
  gen3: 12 is divisible by 6
piping 13
piping 14
  gen1: 14 is divisible by 2
piping 15
piping 16
  gen1: 16 is divisible by 2
  gen2: 16 is divisible by 4
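
The example above only filters values. As a rough sketch of the other case mentioned in this answer, where a stage changes a value before forwarding it, something like the following could work. The names scale and collect are hypothetical and not part of the answer's code; the coroutine decorator is repeated so the snippet runs on its own.

```python
def coroutine(func):
    """Prime a generator-based coroutine so it is ready for send()."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)  # Run up to the first yield.
        return cr
    return start

@coroutine
def scale(factor, *targets):
    """Hypothetical stage: multiply each received value, then forward it."""
    while True:
        value = (yield)
        for target in targets:
            target.send(value * factor)

@coroutine
def collect(results):
    """Hypothetical sink: append every received value to a list."""
    while True:
        results.append((yield))

doubled, tripled = [], []
# One head coroutine fans out to two branches, each ending in its own sink.
head = scale(1, scale(2, collect(doubled)), scale(3, collect(tripled)))
for n in range(4):
    head.send(n)

print(doubled)  # [0, 2, 4, 6]
print(tripled)  # [0, 3, 6, 9]
```

Ending each branch in a sink keeps the results available to ordinary code once the sends are done, instead of only printing them along the way.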

Answer 1: (score: 1)

A pipeline might look more like this:

for x in gen3(gen2(gen1())):
    print(x)

For example:

for i, x in enumerate(range(10)):
    print(i, x)

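Plain generator composition has no way to fork (or "tee") a pipeline, because a generator can be consumed only once. If you need multiple pipelines, you have to duplicate them: gen2(gen1()) and gen3(gen1()). (The standard library does offer itertools.tee, which splits one iterator into several by buffering values.)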
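
For reference, here is a minimal sketch of forking one source with itertools.tee from the standard library, as an alternative to duplicating the pipeline. The generator names numbers, squares, and negatives are made up for illustration.

```python
import itertools

def numbers():
    """Hypothetical root generator."""
    for n in range(5):
        yield n

def squares(source):
    for x in source:
        yield x * x

def negatives(source):
    for x in source:
        yield -x

# tee() returns independent iterators that share the single underlying source.
left, right = itertools.tee(numbers())
squared = list(squares(left))
negated = list(negatives(right))

print(squared)  # [0, 1, 4, 9, 16]
print(negated)  # [0, -1, -2, -3, -4]
```

Keep in mind that tee buffers every item one branch has consumed and the other has not, so draining one branch completely before starting the other, as done here, holds the whole sequence in memory.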

Answer 2: (score: 0)

Dave Beazley gave this example in a talk he did in 2008. The goal is to sum up how many bytes of data were transferred according to an Apache web server log. Assume a log format like:

81.107.39.38 -  ... "GET /ply/ HTTP/1.1" 200 7587
81.107.39.38 -  ... "GET /favicon.ico HTTP/1.1" 404 133
81.107.39.38 -  ... "GET /admin HTTP/1.1" 403 -

A traditional (non-generator) solution might look like this:

with open("access-log") as wwwlog:
    total = 0
    for line in wwwlog:
        bytes_as_str = line.rsplit(None,1)[1]
        if bytes_as_str != '-':
            total += int(bytes_as_str)
print("Total: {}".format(total))

A generator pipeline built from generator expressions can be visualized as:

access-log => wwwlog => bytecolumn => bytes => sum() => total

and might look like:

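with open("access-log") as wwwlog:
    bytecolumn = (line.rsplit(None,1)[1] for line in wwwlog)
    bytes = (int(x) for x in bytecolumn if x != '-')  # Note: shadows the built-in bytes.
    # Generators are lazy, so consume them while the file is still open.
    print("Total: {}".format(sum(bytes)))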

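Dave Beazley's slides and more examples are available on his website. His later presentations elucidate this further.

It's hard to say more without knowing exactly what you're trying to do, so that we could evaluate whether each thing you're doing really needs a custom generator (generator expressions/comprehensions work well for many things without any need to declare generator functions).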
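
To make that comparison concrete, here is a small sketch that computes the same total twice, once with a generator function and once with generator expressions. It uses the three sample log lines from above as an in-memory list instead of a real file; byte_counts is a hypothetical name.

```python
sample_log = [
    '81.107.39.38 -  ... "GET /ply/ HTTP/1.1" 200 7587',
    '81.107.39.38 -  ... "GET /favicon.ico HTTP/1.1" 404 133',
    '81.107.39.38 -  ... "GET /admin HTTP/1.1" 403 -',
]

def byte_counts(lines):
    """Generator-function version: yield the byte count of each line that has one."""
    for line in lines:
        field = line.rsplit(None, 1)[1]  # Last whitespace-separated column.
        if field != '-':
            yield int(field)

total_from_function = sum(byte_counts(sample_log))

# Generator-expression version of the same two steps.
bytecolumn = (line.rsplit(None, 1)[1] for line in sample_log)
total_from_expressions = sum(int(x) for x in bytecolumn if x != '-')

print(total_from_function, total_from_expressions)  # 7720 7720
```

Both forms are lazy and give the same result. The function form mainly pays off when a stage needs a name, local state, or more than one statement per item.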

Answer 3: (score: 0)

Here is a concise example:

def negate_nums(g):
    for x in g:
        yield -x

def square_nums(g):
    for x in g:
        yield x ** 2

def half_num(g):
    for x in g:
        yield x / 2.0

def compose_gens(first_gen,*rest_gens):
    newg = first_gen(compose_gens(*rest_gens)) if rest_gens else first_gen
    return newg

for x in compose_gens(negate_nums,square_nums,half_num,range(10)):
    print(x)

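Here, the generators are composed so that they are applied from right to left in the final compose_gens call. You could turn this into a pipeline by reversing the arguments.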
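
One possible way to get the left-to-right ordering is sketched below with functools.reduce. The helper name pipeline is hypothetical; the three stage functions are copied from the example above so the snippet is self-contained.

```python
from functools import reduce

def negate_nums(g):
    for x in g:
        yield -x

def square_nums(g):
    for x in g:
        yield x ** 2

def half_num(g):
    for x in g:
        yield x / 2.0

def pipeline(source, *stages):
    """Hypothetical helper: feed source through the stages from left to right."""
    return reduce(lambda stream, stage: stage(stream), stages, source)

# Same result as compose_gens(negate_nums, square_nums, half_num, range(1, 4)).
result = list(pipeline(range(1, 4), half_num, square_nums, negate_nums))
print(result)  # [-0.25, -1.0, -2.25]
```

Putting the data source first and the stages after it makes the call read in the order the data actually flows.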