Python threading - how do I use it to run multiple tasks?

Time: 2012-08-21 17:29:51

Tags: python python-multithreading

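I am new to Python, and I got huge help from the Stack Overflow community in migrating my shell script to Python. But I am struggling again, this time with how to implement threading. The script loops over x results, so it would run faster in parallel. For example, if the script returns 120 servers to process, I would like to run 5 at a time and keep the rest in a queue.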

The part I want to run in threads is everything after the following condition (I marked it with comments):

if checkServer.checkit(host,port):

Below is the content of the extract_adapter.py file:

import psycopg2
import urllib2
import base64
import sys
import re
import lxml.html as LH
import checkServer

def extractAdapter(env,family,iserver,login,password,prefix,proxyUser,proxyPass,proxyHost,service):

    print "Starting on \t"+iserver

    proxy_auth = "http://"+proxyUser+":"+proxyPass+"@"+proxyHost
    proxy_handler = urllib2.ProxyHandler({"http": proxy_auth})

    opener = urllib2.build_opener(proxy_handler)
    urllib2.install_opener(opener)
    request = urllib2.Request("http://"+iserver+"/invoke/listRegisteredAdapters")
    base64string = base64.encodestring('%s:%s' % (login, password)).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)
    response = urllib2.urlopen(request)
    html = response.read()

    doc = LH.fromstring(html)
    tds = (td.text_content() for td in doc.xpath("//td[not(*)]"))

    for adapterType, adapterDescription in zip(*[tds]*2):

        proxy_auth = "http://"+proxyUser+":"+proxyPass+"@"+proxyHost
        proxy_handler = urllib2.ProxyHandler({"http": proxy_auth})
        opener = urllib2.build_opener(proxy_handler)
        opener = urllib2.build_opener()
        urllib2.install_opener(opener)
        request = urllib2.Request("http://"+iserver+service+""+adapterType)
        base64string = base64.encodestring('%s:%s' % (login, password)).replace('\n', '')
        request.add_header("Authorization", "Basic %s" % base64string)
        response = urllib2.urlopen(request)
        html2 = response.read()

        doc = LH.fromstring(html2)
        tds = (td.text_content() for td in doc.xpath("//td[not(*)]"))

        for connectionAlias,packageName,connectionFactoryType,mcfDisplayName,connectionState,hasError in zip(*[tds]*6):

            cur.execute("INSERT INTO wip.info_adapter (env,family,iserver,prefix,package,adapter_type,connection_name,status) values (%s,%s,%s,%s,%s,%s,%s,%s)",
            (env,family,iserver,prefix,packageName,adapterType,connectionAlias,connectionState))
            con.commit()

################################################################################

def extract(env):
    global cur,con
    con = None
    try:

        con = psycopg2.connect(database='xx', user='xx',password='xxx',host='localhost')
        cur = con.cursor()
        qry=" random non important query"

        cur.execute(qry)
        data = cur.fetchall()

        for result in data:

            family   = result[0]
            prefix   = result[1]
            iserver  = result[2]
            version  = result[3]
            login    = result[4]
            password = result[5]
            service  = result[6]
            proxyHost = result[7]
            proxyUser = result[8]
            proxyPass = result[9]

            parts=iserver.split(":")
            host=parts[0]
            port=parts[1]

            if checkServer.checkit(host,port):
            ## THE THREAD IS SUPPOSED TO START HERE

                if version == '7' or version == '8':

                    extractAdapter(env,family,iserver,login,password,prefix,proxyUser,proxyPass,proxyHost,service)

                elif version == '60' or version == '61':
                    print "Version 6.0 and 6.1 not supported yet"
            else:
                print iserver+" is offline"
            ## THE THREAD IS SUPPOSED TO END HERE

    except psycopg2.DatabaseError, e:
        print 'Error %s' % e
        sys.exit(1)

    finally:

        if con:
            con.close()

This is how I call the extract method from runme.py:

import extract_adapter_thread
from datetime import datetime

startTime = datetime.now()
print"------------------------------"
extract_adapter_thread.extract('TEST')
print"------------------------------"
print(datetime.now()-startTime)

By the way, the code runs fine. No errors.

3 answers:

Answer 0: (score: 1)

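Because of the Global Interpreter Lock, threads in Python block each other badly on problems that are not IO-bound. You are therefore probably better off with multiprocessing, which comes with a Queue class (see this SO link for an example of using a multiprocessing queue).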

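This lets you work on many separate processes at the same time (for example, batching 5 jobs at a time out of 120). Note that the overhead of a process is higher than that of a thread, so for small tasks you pay a price for using multiprocessing instead of threading. Your tasks sound large enough to justify that cost, though.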
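As a rough sketch of the batching described above, a `multiprocessing.Pool` with 5 worker processes can work through a list of 120 servers. The names `process_server` and `run_in_pool` and the generated server list are made up for illustration; the real per-server work (the online check and the adapter extraction) would go inside `process_server`.

```python
import multiprocessing


def process_server(iserver):
    """Hypothetical stand-in for the per-server work; replace with the real extraction."""
    host, port = iserver.split(":")
    return iserver, len(host) + int(port)


def run_in_pool(servers, workers=5):
    """Process every server using at most `workers` processes at a time."""
    pool = multiprocessing.Pool(processes=workers)
    try:
        # map() queues all servers and hands each one to the next idle worker
        return pool.map(process_server, servers)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # Made-up server list standing in for the rows returned by the query
    servers = ["server%d:5555" % i for i in range(120)]
    results = run_in_pool(servers)
    print("processed %d servers" % len(results))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start workers by re-importing the main module, unguarded pool creation would run again in every child process.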

Answer 1: (score: 0)

If everything is thread-safe, you can use the threading module:

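import threading
from datetime import datetime

import extract_adapter_thread

starttime = datetime.now()
print "-"*10
# The class is threading.Thread (capital T); threading.thread does not exist
code = threading.Thread(target=extract_adapter_thread.extract, args=['TEST'])
code.daemon = True
code.start()
# Wait for the thread; a daemon thread is killed as soon as the main script exits
code.join()
print "-"*10
print(datetime.now()-starttime)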
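A single thread like the one above does not limit the work to 5 servers at a time. One possible way to get that, sketched here under assumptions, is a fixed set of worker threads pulling servers from a queue. This sketch is written for Python 3, where the module is named `queue` (in Python 2 it is `Queue`). `process_server`, `worker`, `run_all`, and the server list are hypothetical names, and the real check and extraction calls would replace the body of `process_server`.

```python
import queue
import threading

NUM_WORKERS = 5  # how many servers are processed at the same time


def process_server(iserver):
    """Hypothetical stand-in for the checkServer + extractAdapter work."""
    return "%s done" % iserver


def worker(tasks, results):
    """Pull servers off the task queue until a None sentinel arrives."""
    while True:
        iserver = tasks.get()
        if iserver is None:
            break
        try:
            results.put((iserver, process_server(iserver)))
        except Exception as exc:
            # Record the failure so one bad server does not kill the worker
            results.put((iserver, exc))


def run_all(servers, num_workers=NUM_WORKERS):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for iserver in servers:
        tasks.put(iserver)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one stops
    for t in threads:
        t.join()
    # All workers are done; collect the results in the calling thread
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


# Made-up server list standing in for the rows returned by the query
servers = ["server%d:5555" % i for i in range(120)]
outcome = run_all(servers)
print("processed %d servers" % len(outcome))
```

Collecting results in the calling thread also sidesteps the thread-safety caveat: the workers never touch a shared database cursor, and the inserts can be done from one place after the threads finish.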

Answer 2: (score: 0)

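I really don't know if this will help much, but I had this code snippet lying around on my HD, so here it is. It is a basic comparison of pinging some IPs in parallel versus sequentially (it requires Linux, though). It is very simple and does not directly answer your specific question, but since you said you are new to Python, it might give you some ideas.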

#!/usr/bin/env python

import datetime
import subprocess
import threading

ipsToPing = [
    "google.com",
    "stackoverflow.com",
    "yahoo.com",
    "terra.es", 
]

def nonThreadedPinger():
    start = datetime.datetime.now()
    for ipToPing in ipsToPing:
        print "Not-threaded ping to %s" % ipToPing
        subprocess.call(["/bin/ping", "-c", "3", "-W", "1.0", ipToPing], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    end = datetime.datetime.now()
    print ("Non threaded ping of %s ips took: %s." % (len(ipsToPing), end-start))

def _threadedPingerAux(ipToPing):
    print "Threaded ping to %s" % ipToPing
    subprocess.call(["/bin/ping", "-c", "3", "-W", "1.0", ipToPing], stdout=subprocess.PIPE, stderr=subprocess.PIPE)

def threadedPinger():
    retval = dict.fromkeys(ipsToPing, -1)
    threads = list()
    start = datetime.datetime.now()
    for ipToPing in ipsToPing:
        thread = threading.Thread(target=_threadedPingerAux, args=[ipToPing])
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    end = datetime.datetime.now()
    print ("Threaded ping of %s ips took: %s" % (len(ipsToPing), end-start))


if __name__ == "__main__":
    threadedPinger()
    nonThreadedPinger()
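The threaded version above starts one thread per host, which is fine for four hosts but not for 120. As a sketch of capping the concurrency at 5, assuming Python 3 (where `concurrent.futures` is in the standard library), a `ThreadPoolExecutor` can be used. `fake_ping` and `bounded_pinger` are hypothetical names; `fake_ping` only sleeps briefly in place of the real `subprocess.call` to ping.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_ping(host):
    """Hypothetical stand-in for the subprocess ping call; returns (host, exit code)."""
    time.sleep(0.01)
    return host, 0


def bounded_pinger(hosts, max_workers=5):
    """Run fake_ping over all hosts with at most max_workers threads at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input hosts
        return list(pool.map(fake_ping, hosts))


# Made-up host names for illustration
hosts = ["host%d" % i for i in range(20)]
results = bounded_pinger(hosts)
print("pinged %d hosts" % len(results))
```

The executor keeps the remaining hosts queued internally and starts each one as soon as a worker thread becomes free, so there is no need to manage the thread list by hand.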