Does using grequests with chunks of addresses differ from sending them all at once?

Time: 2019-03-24 08:07:57

Tags: python http grequests

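I am trying to download the /home.htm file from about 6000 HTTP addresses. To speed things up, I tried sending all the requests at once with grequests, but I only get about 200 answers; most requests give a connection refused error. When I split the addresses into chunks of 100 and send each chunk separately, about 1200 addresses answer (that is, their /home.htm downloads successfully), even though I use the same addresses as before.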

I am running this with Python 3.6 on Ubuntu 16.04.

import grequests
import requests
import sys
import os
import resource

# Counts exceptions and prints them
def exceptionh(request, exception):
    ...

# Yields successive n-sized chunks
def make_chunks(req, n):
    for i in range(0, len(req), n):
        yield req[i:i+n]

def run(ipport):
    # Make http links
    http_links = []
    for ip in ipport:
        http_links.append('http://' + ip.strip() + '/home.htm')

    # Raise the open-file limit; without it there are too many Errno 24 (too many open files) exceptions
    resource.setrlimit(resource.RLIMIT_NOFILE, (131072, 131072))

    # Request making
    rq = []
    ctr = 0
    for link in http_links:
        rq.append(grequests.get(link, timeout=30, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.75 Safari/537.36'}, stream=False))
        ctr += 1
    rq = list(make_chunks(rq, 100))
    # Send requests
    results = []
    for chunk in rq:
        results.append(grequests.map(chunk, exception_handler=exceptionh))

    # Save .html
    for chunk in results:
        for response in chunk:
            if response is not None:
                # write it in html file
                pass

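As described above, the results differ: when I send the requests in chunks, I get more responses than when I send them all at once. Why is that? Is there a better way to solve this problem?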
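One direction that might be worth trying is to cap the number of requests in flight instead of cutting the list into fixed chunks. The sketch below is not the grequests code from the question. It swaps in the standard library's ThreadPoolExecutor and urllib so that it runs without extra packages. The names fetch, fetch_bounded, fetch_one and max_workers are made up for illustration, and the shortened User-Agent string is a placeholder. Whether bounded concurrency actually removes the connection refused errors is an assumption that would need testing against the real addresses. grequests.map also accepts a size argument that limits its pool in a similar way.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def fetch(url, timeout=30):
    """Download one URL; return (url, body) on success or (url, exception) on failure."""
    try:
        # Placeholder User-Agent; the question uses a longer browser string.
        req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urlopen(req, timeout=timeout) as resp:
            return url, resp.read()
    except Exception as exc:  # connection refused, timeout, HTTP error, ...
        return url, exc


def fetch_bounded(urls, fetch_one=fetch, max_workers=100):
    """Fetch every URL with at most max_workers requests running at the same time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the input order and hands the next URL to a worker
        # as soon as one becomes free, so no worker waits for a whole chunk.
        return list(pool.map(fetch_one, urls))
```

Compared with sending chunks of 100 one after another, a bounded pool does not wait for the slowest request of a chunk before starting the next request, while still never opening more than max_workers connections at once. Failures come back as exception objects next to their URL, so they can be counted by type afterwards.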

0 answers:

No answers