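I'm writing a service that retrieves data from S3 when it receives a request. The data is stored in .gz files. The service won't be flooded with requests, which means several seconds may pass before it receives a second request. I can't afford to have the S3 connection reset after such a short time. The problem seems to be that boto3.client() resets the connection sooner than I would like.
For testing, I used this code: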
import boto3
import logging
import datetime
import time
import gzipinputstream
logging.basicConfig(level='DEBUG')
logging.getLogger('botocore').setLevel('INFO')
s3_client = boto3.client('s3')
bucket = 'foo'
key = 'bar'
count = 0
while True:
    count += 1
    start = datetime.datetime.now()
    # Fetch the object and decompress the streamed body.
    x = s3_client.get_object(Bucket=bucket, Key=key)
    y = x['Body']
    z = gzipinputstream.GzipInputStream(y)
    final_obj = z.read()
    end = datetime.datetime.now()
    print("Test #%d: started at %s, ended at %s, duration = %s" % (count, start, end, end - start))
When I run the code above, I see the following. The initial request takes a bit longer than the others, but every request from #2 onward is much faster:
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): session4c.s3.amazonaws.com
Test #1: started at 2017-01-25 14:50:26.295239, ended at 2017-01-25 14:50:30.412478, duration = 0:00:04.117239
Test #2: started at 2017-01-25 14:50:30.412581, ended at 2017-01-25 14:50:30.447595, duration = 0:00:00.035014
Test #3: started at 2017-01-25 14:50:30.447655, ended at 2017-01-25 14:50:30.474377, duration = 0:00:00.026722
Test #4: started at 2017-01-25 14:50:30.474443, ended at 2017-01-25 14:50:30.499979, duration = 0:00:00.025536
Test #5: started at 2017-01-25 14:50:30.500040, ended at 2017-01-25 14:50:30.595240, duration = 0:00:00.095200
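When I use the same code and add time.sleep(10) to the bottom of the loop to simulate a gap between requests, I see the following. Every request takes roughly as long as the first one, because each request reconnects: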
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): session4c.s3.amazonaws.com
Test #1: started at 2017-01-25 14:50:44.916388, ended at 2017-01-25 14:50:49.315392, duration = 0:00:04.399004
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Resetting dropped connection: session4c.s3.amazonaws.com
Test #2: started at 2017-01-25 14:50:59.325521, ended at 2017-01-25 14:51:03.726388, duration = 0:00:04.400867
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Resetting dropped connection: session4c.s3.amazonaws.com
Test #3: started at 2017-01-25 14:51:13.736561, ended at 2017-01-25 14:51:17.273182, duration = 0:00:03.536621
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Resetting dropped connection: session4c.s3.amazonaws.com
Test #4: started at 2017-01-25 14:51:27.282636, ended at 2017-01-25 14:51:31.682258, duration = 0:00:04.399622
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Resetting dropped connection: session4c.s3.amazonaws.com
Test #5: started at 2017-01-25 14:51:41.692450, ended at 2017-01-25 14:51:45.225243, duration = 0:00:03.532793
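The modified loop behind the output above can be sketched roughly as follows. This is a minimal, hedged sketch rather than the exact code that was run: `run_timed`, `fetch`, and `gap_seconds` are hypothetical names introduced here, and `fetch` stands in for the `get_object` plus decompression step so the harness can be exercised without S3. Unlike the original loop, it stops after a fixed number of iterations and skips the sleep after the last one.

```python
import datetime
import time


def run_timed(fetch, iterations, gap_seconds=0):
    """Call fetch() repeatedly, pausing gap_seconds between calls.

    Returns a list of datetime.timedelta durations, one per call.
    """
    durations = []
    for count in range(1, iterations + 1):
        start = datetime.datetime.now()
        fetch()
        end = datetime.datetime.now()
        durations.append(end - start)
        print("Test #%d: started at %s, ended at %s, duration = %s"
              % (count, start, end, end - start))
        if count < iterations:
            # Idle period that simulates the gap between real requests.
            time.sleep(gap_seconds)
    return durations
```

Passing a function that performs the S3 read as `fetch`, with `gap_seconds=10`, would approximate the second experiment; `gap_seconds=0` would approximate the first.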
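I have searched high and low for a way to increase the timeout in boto3.client() and its underlying requests and urllib3 libraries, but have come up empty. I can't find anything in the boto3 documentation either. Adding use_ssl=False to the boto3.client() call does help reduce the network chatter: the reconnect happens after 20-30 seconds instead of in under 10 seconds.
Is there a way to increase how long the S3 connection stays open? Any help would be appreciated!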
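Answer 0 (score: 1)
No, there isn't.
S3 itself drops any idle Keep-Alive connection after a few seconds, as any web server would. A "connection" to S3 is just an HTTP/S connection to the S3 API endpoint. These connections aren't meant to be held open for long periods. The real question may be why the connection takes 4 seconds to establish. That seems excessive.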
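One way to look into where those 4 seconds go is to time each phase of connection setup separately. The following is a minimal sketch using only the Python 3 standard library; it is not part of boto3, and `time_connection_phases` and its parameters are hypothetical names chosen for illustration. It measures DNS resolution, the TCP connect, and (optionally) the TLS handshake for a given host.

```python
import socket
import ssl
import time


def time_connection_phases(host, port=443, use_tls=True, timeout=10.0):
    """Return seconds spent in DNS lookup, TCP connect, and TLS handshake."""
    timings = {}

    t0 = time.perf_counter()
    # Resolve the hostname and take the first address returned.
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    timings["dns"] = time.perf_counter() - t0

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        t1 = time.perf_counter()
        sock.connect(sockaddr)
        timings["tcp_connect"] = time.perf_counter() - t1

        if use_tls:
            context = ssl.create_default_context()
            t2 = time.perf_counter()
            # Wrapping an already-connected socket performs the handshake.
            sock = context.wrap_socket(sock, server_hostname=host)
            timings["tls_handshake"] = time.perf_counter() - t2
    finally:
        sock.close()
    return timings
```

Calling it with the endpoint hostname from the log output and comparing the three numbers would show whether name resolution, the network round trip, or the TLS handshake dominates. A slow `dns` value, for example, would point at the resolver rather than at S3.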