Question

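Is there a way to speed up my code using the multiprocessing interface? The problem is that this interface uses the map function, which works with only one function, but my code has three. I tried to combine my functions into one, but without success. My script reads site URLs from a file and runs the three functions on each of them. The for loop makes it very slow because I have a lot of URLs.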

import requests

def Login(url): #Log in     
    payload = {
        'UserName_Text'     : 'user',
        'UserPW_Password'   : 'pass',
        'submit_ButtonOK'   : 'return buttonClick;'  
      }

    try:
        p = session.post(url+'/login.jsp', data = payload, timeout=10)
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
        print "site is DOWN! :", url[8:]
        session.cookies.clear()
        session.close() 
    else:
        print 'OK: ', p.url

def Timer(url): #Measure request time
    try:
        timer = requests.get(url+'/login.jsp').elapsed.total_seconds()
    except (requests.exceptions.ConnectionError):
        print 'Request time: None'
        print '-----------------------------------------------------------------'
    else: 
        print 'Request time:', round(timer, 2), 'sec'

def Logout(url): # Log out
    try:
        logout = requests.get(url+'/logout.jsp', params={'submit_ButtonOK' : 'true'}, cookies = session.cookies)
    except(requests.exceptions.ConnectionError):
        pass
    else:
        print 'Logout '#, logout.url
        print '-----------------------------------------------------------------'
        session.cookies.clear()
        session.close()
for line in open('text.txt').read().splitlines():
    session = requests.session()
    Login(line)
    Timer(line)
    Logout(line)

Answer 1

Yes, you can use multiprocessing.

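import requests
from multiprocessing import Pool

def f(line):
    # Each call creates its own session, so nothing is shared between workers.
    # Login, Timer and Logout must be changed to take the session as their first argument.
    session = requests.session()
    Login(session, line)
    Timer(session, line)
    Logout(session, line)

if __name__ == '__main__':
    urls = open('text.txt').read().splitlines()
    p = Pool(5)
    # f returns None, so this prints a list of None values, one per URL
    print(p.map(f, urls))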

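The requests session must not be a global shared between workers; each worker should use its own session.

You wrote that you had already "tried to combine my functions into one, but without success". What exactly did not work?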
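
To make the "one function for map" idea concrete, here is a minimal sketch that runs without network access. The names login, timer, logout, check_site and check_all are hypothetical stand-ins, not the asker's real functions, and a plain dict stands in for requests.session(). It also uses ThreadPool, the thread-based class from multiprocessing.pool that offers the same map interface as Pool, rather than Pool itself, on the assumption that the work is I/O-bound.

```python
from multiprocessing.pool import ThreadPool


# Hypothetical stand-ins for the question's Login, Timer and Logout.
# Each takes the per-task session explicitly instead of reading a global.
def login(session, url):
    session["logged_in"] = True
    return "login " + url


def timer(session, url):
    return "timed " + url


def logout(session, url):
    session["logged_in"] = False
    return "logout " + url


def check_site(url):
    """Run all three steps for one URL; this is the single function map needs."""
    session = {}  # stand-in for requests.session(), created once per call
    return [step(session, url) for step in (login, timer, logout)]


def check_all(urls, workers=5):
    """Map check_site over the URLs; results come back in input order."""
    with ThreadPool(workers) as pool:
        return pool.map(check_site, urls)


if __name__ == "__main__":
    for result in check_all(["https://a.example", "https://b.example"]):
        print(result)
```

Because the steps return their messages instead of printing them, output from different URLs does not interleave; the caller prints each result once map has finished. Replacing ThreadPool with multiprocessing.Pool should need no other change, as long as check_site stays a top-level function and the pool is created under the __main__ guard.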

Answer 2

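There are many ways to accomplish your task, but multiprocessing is not needed at that level; it would only add complexity, imho.

Take a look at gevent, greenlets and monkey patching!

Once your code is ready, you can wrap a main function in a gevent loop. If you have applied the monkey patches, the gevent framework will run N jobs concurrently (you can create a pool of jobs, set a limit on concurrency, etc.).

This example should help: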

#!/usr/bin/python
# Copyright (c) 2009 Denis Bilenko. See LICENSE for details.

"""Spawn multiple workers and wait for them to complete"""
from __future__ import print_function
import sys

urls = ['http://www.google.com', 'http://www.yandex.ru', 'http://www.python.org']

import gevent
from gevent import monkey

# patches stdlib (including socket and ssl modules) to cooperate with other greenlets
monkey.patch_all()


if sys.version_info[0] == 3:
    from urllib.request import urlopen
else:
    from urllib2 import urlopen


def print_head(url):
    print('Starting %s' % url)
    data = urlopen(url).read()
    print('%s: %s bytes: %r' % (url, len(data), data[:50]))

jobs = [gevent.spawn(print_head, url) for url in urls]

gevent.wait(jobs)

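You can find more examples in the gevent project, including the GitHub repository this example comes from.

P.S. Greenlets also work with requests; you don't need to change your code.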

Python requests module multithreading
