Running a parallel function in Python Tornado

Time: 2017-07-24 11:06:25

Tags: python parallel-processing tornado coroutine

I'm currently developing on the Tornado framework (still a beginner), and I have a function that I would like to run in the background. More precisely, the function's task is to download a big file (chunk by chunk) and possibly do some more work after each chunk is downloaded. The calling function should not wait for the download function to finish; it should continue executing.

Here is some example code:

@gen.coroutine
def dosomethingfunc(self, env):
    print("Do something")
    self.downloadfunc(file_url, target_path)  # I don't want to wait here
    print("Do something else")

@gen.coroutine
def downloadfunc(self, file_url, target_path):
    response = urllib.request.urlopen(file_url)
    CHUNK = 16 * 1024
    with open(target_path, 'wb') as f:
        while True:
            chunk = response.read(CHUNK)
            if not chunk:
                break
            f.write(chunk)
            time.sleep(0.1)  # do something after a chunk is downloaded - sleep only as example

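I have read this answer on Stack Overflow, https://stackoverflow.com/a/25083098/2492068, and tried to use it.

I thought that if I called downloadfunc without yield, dosomethingfunc would continue without waiting for downloadfunc to finish. But the behaviour is the same with or without yield: "Do something else" is only printed after downloadfunc has finished the download.

What am I missing here?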

1 answer:

Answer 0 (score: 2)

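To benefit from Tornado's asynchrony, a non-blocking function must be yielded at some point. Since all of the code in downloadfunc is blocking, dosomethingfunc does not get control back until the called function has finished.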
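The effect can be reproduced without Tornado. The following is a minimal sketch that uses the standard library's asyncio instead of Tornado's coroutines (a swapped-in stand-in, not the code from the question); the names blocking_download, nonblocking_download, caller and run are made up for illustration. It records the order of events when the "download" blocks and when it yields to the event loop.

```python
import asyncio
import time


async def blocking_download(events):
    # time.sleep never hands control back to the event loop,
    # so nothing else can run until it returns
    time.sleep(0.05)
    events.append("download done")


async def nonblocking_download(events):
    # asyncio.sleep suspends this coroutine and lets the loop run other work
    await asyncio.sleep(0.05)
    events.append("download done")


async def caller(download, events):
    events.append("do something")
    task = asyncio.ensure_future(download(events))  # start it, do not wait
    await asyncio.sleep(0)  # give the loop one turn so the download starts
    events.append("do something else")
    await task  # only so that the demo does not exit before the download ends


def run(download):
    events = []
    asyncio.run(caller(download, events))
    return events


if __name__ == "__main__":
    print(run(blocking_download))
    print(run(nonblocking_download))
```

With the blocking version, "do something else" is recorded only after the download has finished; with the non-blocking version it is recorded while the download is still pending.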

There are a couple of issues with your code:

  • time.sleep is blocking, use tornado.gen.sleep instead,
  • urllib's urlopen is blocking, use tornado.httpclient.AsyncHTTPClient

So downloadfunc could look like this:

from tornado import gen
import tornado.httpclient

@gen.coroutine
def downloadfunc(self, file_url, target_path):

    client = tornado.httpclient.AsyncHTTPClient()

    # the fetch below starts the download and gives control
    # back to the IOLoop while waiting for data
    res = yield client.fetch(file_url)

    # res is an HTTPResponse; the downloaded bytes are in res.body
    with open(target_path, 'wb') as f:
        f.write(res.body)
        yield gen.sleep(0.1)

To implement it with streaming (chunk by chunk) support, you might want to do it this way:

# for large files you must increase max_body_size,
# because the default body limit in Tornado is 100MB

tornado.httpclient.AsyncHTTPClient.configure(None, max_body_size=2*1024**3)

@gen.coroutine
def downloadfunc(self, file_url, target_path):

    client = tornado.httpclient.AsyncHTTPClient()

    def write_chunk(chunk):
        # defined inside downloadfunc so that it can see target_path;
        # note the "a" mode, to append to the file
        with open(target_path, 'ab') as f:
            print('chunk %s' % len(chunk))
            f.write(chunk)

    # streaming_callback is called with each received portion of data
    yield client.fetch(file_url, streaming_callback=write_chunk)

Now you can call it in dosomethingfunc without yield, and the rest of the function will proceed.
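The streaming callback only receives the chunk, so it has to get the target path from somewhere else. One possible way, sketched here with the standard library only, is a small factory that opens the file once and returns the callback together with a close function. The names make_chunk_writer and demo are hypothetical and not part of Tornado; the demo feeds the callback a few hand-made chunks in place of a real download.

```python
import os
import tempfile


def make_chunk_writer(target_path):
    """Return (write_chunk, close) for a file that is opened once."""
    f = open(target_path, "wb")
    state = {"bytes": 0}

    def write_chunk(chunk):
        # called once per received chunk
        f.write(chunk)
        state["bytes"] += len(chunk)
        # any per-chunk work would go here

    def close():
        # close the file and report how many bytes were written
        f.close()
        return state["bytes"]

    return write_chunk, close


def demo():
    # stand-in for a download: three hand-made chunks
    path = os.path.join(tempfile.mkdtemp(), "download.bin")
    write_chunk, close = make_chunk_writer(path)
    for chunk in (b"abc", b"defg", b"h"):
        write_chunk(chunk)
    total = close()
    with open(path, "rb") as fh:
        return total, fh.read()


if __name__ == "__main__":
    print(demo())
```

In the Tornado code, write_chunk would be passed as streaming_callback to client.fetch, and close() would be called once the yielded fetch has returned, ideally in a try/finally block.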

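Edit

Modifying the chunk size is not supported (exposed), from either the server side or the client side. You may also want to look at tornado.httpclient.AsyncHTTPClient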