Python multithreading: missing jobs

Date: 2014-04-09 06:49:26

Tags: python multithreading automation mechanize webautomation

If I run the script step by step it works perfectly, but when I use threads, 50-60% of the jobs are missed. I am using Python with the mechanize module.

#setting up the browser
mySite = 'http://example.com/managament.php?'
postData = {'UserID' : '', 'Action':'Delete'}
job_tab1_user1 = [1,2,3]
job_tab2_user1 = [4,5,6]
job_tab1_user2 = [7,8,9]
job_tab2_user2 = [10,12,13]
# ... and so on up to user1000
# note: the lists are 100% different from each other
def user1_jobs():
    for i in job_tab1_user1:
        browser.open("http://example.com/jobs.php?actions=" + str(i))
        browser.open(mySite, postData)
    for i in job_tab2_user1:
        browser.open("http://example.com/jobs.php?actions=" + str(i))
        browser.open(mySite, postData)
def user2_jobs():
    for i in job_tab1_user2:
        browser.open("http://example.com/jobs.php?actions=" + str(i))
        browser.open(mySite, postData)
    for i in job_tab2_user2:
        browser.open("http://example.com/jobs.php?actions=" + str(i))
        browser.open(mySite, postData)
# ... and so on up to user 1000

At the end I call them like this:

t_user1 = threading.Thread(target=user1_jobs, args=[])
t_user1.start()
t_user2 = threading.Thread(target=user2_jobs, args=[])
t_user2.start()

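I have a similar script that sends 200 requests per second, and all of those requests get processed. I also tried adding time.sleep(2), but a lot of jobs were still lost. Apart from what is wrong with my script, my other question is whether this code can be condensed, because I have 1000 users and the script runs to thousands of lines. Thanks in advance.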

2 Answers:

Answer 0 (score: 1)

from threading import Thread

# one list of job ids per user
submits = [[1,2,3], [3,4,5], [6,7,8]]

class worker(Thread):
    def __init__(self, site, postdata, data):
        Thread.__init__(self)
        self.data = data
        self.site = site
        self.postdata = postdata
        self.start()  # the thread starts itself as soon as it is created
    def run(self):
        # browser is the mechanize browser set up in the question
        for i in self.data:
            browser.open("http://example.com/jobs.php?actions="+str(i))
            browser.open(self.site, self.postdata)
for obj in submits:
    # pass this user's list (obj), not the whole submits list
    worker('http://example.com/managament.php?', {'UserID' : '', 'Action':'Delete'}, obj)

Since the OP asked for it, here is a condensed version of the code.

Or:

for index in range(0,1000):
    worker('http://example.com/managament.php?', {'UserID' : '', 'Action':'Delete'}, [i for i in range(1,4)])

This second form applies if the data you want to send really is a sequence of 3 integers (1, 2, 3) that increases in perfect order.
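One possible cause of the lost jobs, not confirmed anywhere in this thread, is that every thread shares the same `browser` object. A minimal sketch of the alternative is below: each task builds its own browser, and a bounded pool limits how many run at once. `RecordingBrowser`, `run_user_jobs`, `run_all` and the `max_workers` value are hypothetical names chosen for illustration; the stand-in browser only records calls, so a real browser factory would have to be passed in instead.

```python
from concurrent.futures import ThreadPoolExecutor

JOBS_URL = "http://example.com/jobs.php?actions="
MY_SITE = "http://example.com/managament.php?"
POST_DATA = {"UserID": "", "Action": "Delete"}


class RecordingBrowser:
    """Hypothetical stand-in for a browser; it only records what was opened."""

    def __init__(self):
        self.opened = []

    def open(self, url, data=None):
        self.opened.append((url, data))


def run_user_jobs(job_ids, make_browser=RecordingBrowser):
    # Each call builds its own browser, so no browser state is shared
    # between threads.
    browser = make_browser()
    for job_id in job_ids:
        browser.open(JOBS_URL + str(job_id))
        browser.open(MY_SITE, POST_DATA)
    # Return the number of requests issued so the caller can check for gaps.
    return len(browser.opened)


def run_all(jobs_per_user, max_workers=8):
    # A bounded pool avoids starting one thread per user all at once.
    # Leaving the with-block waits for every submitted task to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_user_jobs, jobs_per_user))


if __name__ == "__main__":
    submits = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(run_all(submits))
```

Returning a count per user makes it easy to compare the number of requests issued against the number expected, which is one way to narrow down where jobs go missing.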

Answer 1 (score: 1)

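Here is a complete script that you can easily modify by changing the initial variables. It builds the job list dynamically and uses a generator to create the function for each thread. As written, it creates 1000 users, each with 2 tabs and 3 jobs.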

import threading

# define your variables here
NUM_USERS = 1000
NUM_JOBS_PER_USER = 3
NUM_TABS_PER_USER = 2
URL_PART = "http://example.com/jobs.php?actions="

# populate our list of jobs
# the structure is like this: jobs[user][tab][job]

jobs = [[[0 for y in range(NUM_JOBS_PER_USER)] \
            for x in range(NUM_TABS_PER_USER)] \
            for x in range(NUM_USERS)]
p = 1
for i in range(NUM_USERS):
    for j in range(NUM_TABS_PER_USER):
        for k in range(NUM_JOBS_PER_USER):
            jobs[i][j][k] = p
            p += 1


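# create a generator that builds our thread functions
# (browser, mySite and postData come from the question's setup)
def generateFunctions(jobs):
    for user in jobs:
        for tab in user:
            for job in tab:
                # bind job as a default argument so each function keeps its own value
                def f(job=job):
                    browser.open(URL_PART + str(job))
                    browser.open(mySite, postData)
                yield f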

# create and start threads, add them to a list
# if we need to preserve handlers for later use
threads = []
for f in generateFunctions(jobs):
    thr = threading.Thread(target = f, args=[])
    thr.start()
    threads.append(thr)
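A function defined inside a loop captures the loop variable itself, not the value it held at that moment. That matters when such functions are handed to threads that may run after the loop has moved on. The sketch below is an illustration under that assumption, not part of the original answer: it contrasts late and early binding, and shows `join()` being used to wait for every thread. The helper names `make_late_bound`, `make_early_bound` and `run_in_threads` are hypothetical.

```python
import threading


def make_late_bound(values):
    # Every function reads v when it is called, after the loop has finished.
    funcs = []
    for v in values:
        def f():
            return v
        funcs.append(f)
    return funcs


def make_early_bound(values):
    # The default argument freezes the current value of v for each function.
    funcs = []
    for v in values:
        def f(v=v):
            return v
        funcs.append(f)
    return funcs


def run_in_threads(funcs):
    # Each thread writes to its own slot, so no lock is needed here.
    results = [None] * len(funcs)

    def call(index, func):
        results[index] = func()

    threads = [
        threading.Thread(target=call, args=(i, f))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for completion before reading the results
    return results


if __name__ == "__main__":
    print(run_in_threads(make_late_bound([1, 2, 3])))
    print(run_in_threads(make_early_bound([1, 2, 3])))
```

With late binding every function sees the last value of the loop variable, so earlier jobs are silently replaced by repeats of the last one. The early-bound version keeps one distinct value per function.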