I'm new to making HTTP requests, so a lot of what I've read in the requests library documentation goes over my head. I'm trying to automate downloading a large number of files. As I understand it, if I use requests.Session() and then get() each file (first example below), requests will keep the connection to the server open and the process will be faster. Without requests.Session() (second example), the code will open a new connection to the server for each get(). Is this a correct understanding of what is happening in the two examples below?
With requests.Session():
    import requests

    urls = ['https://www.sec.gov/Archives/edgar/full-index/2016/QTR1/master.idx',
            'https://www.sec.gov/Archives/edgar/full-index/2016/QTR2/master.idx',
            'https://www.sec.gov/Archives/edgar/full-index/2016/QTR3/master.idx']

    with requests.Session() as s:
        for url in urls:
            doc = s.get(url)
            # Process file
Without requests.Session():
    import requests

    urls = ['https://www.sec.gov/Archives/edgar/full-index/2016/QTR1/master.idx',
            'https://www.sec.gov/Archives/edgar/full-index/2016/QTR2/master.idx',
            'https://www.sec.gov/Archives/edgar/full-index/2016/QTR3/master.idx']

    for url in urls:
        doc = requests.get(url)
        # Process file
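In case it helps frame the question: here is a minimal sketch (no network calls, just introspection) of the machinery I believe is involved, assuming the standard requests internals where a Session mounts HTTPAdapter objects backed by a urllib3 connection pool, while the module-level requests.get() builds and discards a throwaway Session on every call.

```python
import requests

with requests.Session() as s:
    # A Session mounts HTTPAdapter objects for 'http://' and 'https://'.
    # Each adapter wraps a urllib3 pool manager that keeps TCP connections
    # open, so repeated get() calls to the same host can reuse them.
    adapter = s.get_adapter('https://www.sec.gov')
    print(type(adapter).__name__)  # HTTPAdapter

# By contrast, module-level requests.get() constructs a fresh Session
# internally on each call, so no connection survives between requests.
```

If that picture is right, the first example should do the TLS handshake with www.sec.gov once and reuse the socket, while the second pays that cost on every iteration.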