How can I use threads with requests?

Time: 2019-07-31 05:56:27

Tags: python python-3.x multithreading python-2.7 python-multithreading

Hello, I am using the requests module, and since I have a lot of URLs I would like to speed things up, so I thought I could use threads for that. Here is my code:

import requests

urls = ["http://www.google.com", "http://www.apple.com", "http://www.microsoft.com", "http://www.amazon.com", "http://www.facebook.com"]
for url in urls:
    response = requests.get(url)
    value = response.json()

But I don't know how to use requests with threads...

Could you help me?

Thanks!

2 Answers:

Answer 0 (score: 0)

You can use the concurrent.futures module.

    import concurrent.futures
    import requests
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=5)  # pick however many worker threads you want
    responses = list(pool.map(requests.get, urls))  # urls is the list from the question; list() forces the lazy map to run

This gives you controlled concurrency: max_workers puts an upper bound on how many requests run at the same time.
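
Applied to the urls from the question, a rough sketch could look like this (fetch_json is just an illustrative helper name, and calling .json() assumes those endpoints actually return JSON):

import concurrent.futures
import requests

urls = ["http://www.google.com", "http://www.apple.com", "http://www.microsoft.com"]

def fetch_json(url):
    # worker run by the pool: download one URL and parse the body as JSON
    response = requests.get(url, timeout=10)
    return response.json()

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as pool:
    # map submits one task per URL and yields the results in input order
    values = list(pool.map(fetch_json, urls))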

Here is a direct example from the ThreadPoolExecutor documentation:

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
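
The same submit/as_completed pattern also works with requests instead of urllib.request; a rough sketch (with a shortened URL list and an arbitrary 60-second timeout):

import concurrent.futures
import requests

URLS = ['http://www.foxnews.com/', 'http://www.cnn.com/',
        'http://www.bbc.co.uk/', 'http://some-made-up-domain.com/']

def load_url(url, timeout):
    # same role as in the documentation example, just using requests
    return requests.get(url, timeout=timeout)

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            response = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r returned %d bytes' % (url, len(response.content)))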

Answer 1 (score: 0)

Just to add to bashrc's answer above: you can use this with requests as well, you don't need to go through urllib.request.

Like this:

import requests
from concurrent import futures

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']
with futures.ThreadPoolExecutor(max_workers=5) as executor:  ## you can raise max_workers to create more threads
    res = executor.map(requests.get, URLS)
responses = list(res)  ## executor.map returns a lazy iterator, so you usually want to turn it into a list

However, what I would do is create a function that returns the JSON directly from the response (or the text, if you are scraping), and use that function in the thread pool:

import requests
from concurrent import futures
URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

def getData(url):
    res = requests.get(url)
    try:
        return res.json()
    except ValueError:  ## raised when the body is not valid JSON
        return res.text

with futures.ThreadPoolExecutor(max_workers=5) as executor:
    res = executor.map(getData, URLS)
responses = list(res)  ## each entry is already the parsed JSON (or the raw text)
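
One thing to keep in mind: executor.map re-raises a worker's exception at the point where you iterate the results, and URLS above includes a domain that will not resolve, so the list(res) call can fail with a ConnectionError. If you would rather keep the results for the URLs that did work, you could catch requests' exceptions inside the worker, roughly like this (the timeout value is arbitrary):

import requests
from concurrent import futures

URLS = ['http://www.foxnews.com/', 'http://www.cnn.com/',
        'http://some-made-up-domain.com/']

def getData(url):
    try:
        res = requests.get(url, timeout=10)
    except requests.RequestException as exc:
        # connection errors, timeouts, etc. are returned instead of raised,
        # so one bad URL does not abort the whole batch
        return exc
    try:
        return res.json()
    except ValueError:
        return res.text

with futures.ThreadPoolExecutor(max_workers=5) as executor:
    responses = list(executor.map(getData, URLS))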