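A huge binary file has been uploaded to Google Drive. I'm developing a Tornado-based HTTP proxy server that serves a binary stream of that same file. It seems natural to proxy the file in multiple chunks (download the content with PyDrive, then send it on with self.write(chunk) or something similar).
The problem is that there seems to be only one option for downloading a binary file from Google Drive in chunks, googleapiclient.http.MediaIoBaseDownload, and it only accepts a file descriptor or an io.Base object as its first argument.
My code looks like this: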
    import io

    import tornado.httpserver
    import tornado.ioloop
    import tornado.web
    from googleapiclient.http import MediaIoBaseDownload
    from pydrive.auth import GoogleAuth
    from pydrive.drive import GoogleDrive


    class VideoContentHandler(tornado.web.RequestHandler):
        def get(self, googledrive_id):
            googledrive_id = googledrive_id[1:]  # strip the leading "/"
            query = "'" + self.application.google_drive_folder_id + "' in parents and trashed=false"
            file_list = self.application.drive.ListFile({'q': query}).GetList()
            # io.FileIO will save the chunks to a local file!
            # This is not what I want.
            # Using something different may solve the problem?
            with io.FileIO('/tmp/bigvid-from-pydrive.mp4', mode='wb') as local_file:
                for f in file_list:
                    if f['id'] != googledrive_id:
                        continue
                    file_id = f.metadata.get('id')
                    request = self.application.drive.auth.service.files().get_media(fileId=file_id)
                    downloader = MediaIoBaseDownload(local_file, request, chunksize=2048 * 1024)
                    done = False
                    while done is False:
                        status, done = downloader.next_chunk()
                        # Flush buffer and self.write(chunk)?


    def main():
        gauth = GoogleAuth()
        gauth.CommandLineAuth()  # Already done
        app = tornado.web.Application([
            (r"^/videocontents(/.+)?$", VideoContentHandler),
        ])
        # The handler reads these through self.application
        app.drive = GoogleDrive(gauth)
        app.google_drive_folder_id = '<GOOGLE_DRIVE_FOLDER_ID>'
        http_server = tornado.httpserver.HTTPServer(app)
        http_server.listen(8888)
        tornado.ioloop.IOLoop.instance().start()


    if __name__ == "__main__":
        main()
When should I call self.write(chunk)?
Answer 0 (score: 1)
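You can use io.BytesIO instead of io.FileIO. It will be faster, because each chunk stays in memory instead of being written to disk.
I haven't tested this, but here is roughly what your code would look like (read the comments for the explanation):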
    from tornado import gen

    # you need to make your get method a coroutine
    @gen.coroutine
    def get(self, googledrive_id):
        ...
        # with io.FileIO(...)  <<<< YOU DON'T NEED THIS LINE NOW
        for f in file_list:
            ...
            buffer = io.BytesIO()  # create a BytesIO object
            downloader = MediaIoBaseDownload(buffer, request, chunksize=2048*1024)
            # Now is the time to set appropriate headers
            self.set_header('Content-Type', 'video/mp4')
            # if you know the size of the video
            # write the Content-Length header
            self.set_header('Content-Length', <size of file in bytes>)
            # if not, it's still ok
            done = False
            while done is False:
                status, done = downloader.next_chunk()
                # at this point, downloader has written
                # the chunk to buffer
                # we'll read that data and write it to the response
                self.write(buffer.getvalue())
                # now flush the data to the socket
                yield self.flush()
                # we'll also need to empty the buffer
                # otherwise, it will eat up all the RAM
                buffer.truncate(0)
                # seek to the beginning or else it will mess up
                buffer.seek(0)
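
If you want to convince yourself that draining and resetting the buffer works before wiring it into Tornado, here is a minimal sketch you can run without Google credentials. FakeDownloader and iter_chunks are hypothetical names I made up for illustration; FakeDownloader only imitates the one behaviour the loop above relies on, namely that next_chunk() writes a chunk into the file-like object and returns (status, done).

```python
import io


class FakeDownloader:
    """Hypothetical stand-in for MediaIoBaseDownload, for illustration only.

    Each next_chunk() call writes one chunk into the file-like object it
    was given and returns a (status, done) tuple.
    """

    def __init__(self, fd, data, chunksize):
        self._fd = fd
        self._data = data
        self._chunksize = chunksize
        self._pos = 0

    def next_chunk(self):
        chunk = self._data[self._pos:self._pos + self._chunksize]
        self._fd.write(chunk)
        self._pos += len(chunk)
        done = self._pos >= len(self._data)
        return None, done  # status is not used in this sketch


def iter_chunks(downloader, buffer):
    """Yield each downloaded chunk as bytes, reusing a single BytesIO."""
    done = False
    while not done:
        _status, done = downloader.next_chunk()
        yield buffer.getvalue()
        # Rewind first, then truncate at the current position (0),
        # so the next chunk lands in an empty buffer.
        buffer.seek(0)
        buffer.truncate()


payload = bytes(range(256)) * 10  # 2560 bytes of sample data
buf = io.BytesIO()
chunks = list(iter_chunks(FakeDownloader(buf, payload, chunksize=1024), buf))
print([len(c) for c in chunks])  # [1024, 1024, 512]
```

In the handler, the same generator would be used as `for chunk in iter_chunks(downloader, buffer): self.write(chunk); yield self.flush()`. Keep in mind that next_chunk() on the real downloader is a blocking call, so the IOLoop cannot serve other requests while each chunk is being fetched.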