Limiting the number of threads working in parallel

Time: 2017-02-11 08:21:43

Tags: python multithreading paramiko python-multithreading

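I am writing a function that copies a file from the local machine to a remote host, and I create threads so the SFTP transfers run in parallel: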

def copyToServer(hostname, username, password, destPath, localPath):
    # copies the file, given the host name and credentials
    pass

for i in hostsList:
    hostname = i
    username = defaultLogin
    password = defaultPassword
    thread = threading.Thread(target=copyToServer, args=(hostname, username, password, destPath, localPath))
    threadsArray.append(thread)
    thread.start()

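This creates the threads and starts copying in parallel, but I want to limit it to 50 threads at a time, because the total number of servers may be very large.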

3 answers:

Answer 0 (score: 6)

You need to adjust your code so the threads share and keep track of a common value.

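This can be done with a Semaphore object. The object holds an internal counter that starts at the maximum you define, and every thread tries to acquire it. Each acquire decrements the counter and each release increments it. When the counter reaches zero, a thread cannot acquire the semaphore and is blocked until another thread releases it.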

A short example with at most 5 threads in parallel: half of the 10 threads run immediately, while the others are blocked and wait:

import threading
import time

maxthreads = 5
sema = threading.Semaphore(value=maxthreads)
threads = list()

def task(i):
    # the with-block acquires the semaphore and always releases it
    with sema:
        print("start %s" % (i,))
        time.sleep(2)

for i in range(10):
    # args must be a tuple, hence the trailing comma
    thread = threading.Thread(target=task, args=(str(i),))
    threads.append(thread)
    thread.start()

Output:

start 0
start 1
start 2
start 3
start 4

and, about two seconds later, once the first threads have finished, the next ones run:

start 5
start 6
start 7
start 8
start 9
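
Applied to the loop from the question, the same idea might look like the sketch below. It rests on several assumptions: copy_to_server is a hypothetical stand-in that only sleeps (a real version would perform the SFTP transfer), the host names are made up, and the running/peak counters exist only to show that the limit holds.

```python
import threading
import time

MAX_PARALLEL = 50  # the limit asked for in the question
sema = threading.BoundedSemaphore(MAX_PARALLEL)

# Bookkeeping used only to demonstrate that the limit holds.
state_lock = threading.Lock()
running = 0
peak = 0

def copy_to_server(hostname):
    """Hypothetical placeholder: a real version would do the SFTP copy."""
    time.sleep(0.01)

def limited_copy(hostname):
    global running, peak
    with sema:  # blocks while MAX_PARALLEL copies are already running
        with state_lock:
            running += 1
            peak = max(peak, running)
        try:
            copy_to_server(hostname)
        finally:
            with state_lock:
                running -= 1

hosts = ["host%d" % n for n in range(200)]  # made-up host names
threads = [threading.Thread(target=limited_copy, args=(h,)) for h in hosts]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak parallel copies:", peak)
```

Note that all 200 threads are still created up front; the semaphore only limits how many of them are doing work at any moment.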

Answer 1 (score: 0)

#!/usr/bin/python
# -*- coding: utf-8 -*-
import time
from threading import Lock, Thread, active_count
from random import uniform # get some random time

thread_list = []
names = ['Alfa', ' Bravo', ' Charlie', ' Delta', ' Echo', ' Foxtrot', ' Golf', ' Hotel', ' India', ' Juliett', ' Kilo', ' Lima']
#-------------------------------------------------------------------------

def testFunction(inputName):
    waitTime = uniform(0.987, 2.345) # Random time between 0.987 and 2.345 seconds
    time.sleep(waitTime)
    print ('Finished working on name: ' + inputName)
#-------------------------------------------------------------------------

n_threads = 4 # define max child threads. 
for list_names in names:

    print ( 'Launching thread with name: ' + list_names )
    t = Thread(target=testFunction, args=(list_names,))
    thread_list.append(t)
    t.start()

    while active_count() > n_threads: # max thread count (includes parent thread)
        print ( '\n == Current active threads ==: ' + str(active_count()-1) )
        time.sleep(1) # block until active threads are less than 4

for ex in thread_list: # wait for all threads to finish
    ex.join()
#-------------------------------------------------------------------------
print ( '\n At this point we continue on main thread \n' )

This should give you something like this:

# time ./threads.py
Launching thread with name: Alfa
Launching thread with name:  Bravo
Launching thread with name:  Charlie
Launching thread with name:  Delta

== Current active threads ==: 4

== Current active threads ==: 4
Finished working on name:  Bravo
Finished working on name:  Delta
Finished working on name: Alfa
Finished working on name:  Charlie
Launching thread with name:  Echo
Launching thread with name:  Foxtrot
Launching thread with name:  Golf
Launching thread with name:  Hotel

== Current active threads ==: 4

== Current active threads ==: 4
Finished working on name:  Hotel
Finished working on name:  Foxtrot
Launching thread with name:  India
Launching thread with name:  Juliett

== Current active threads ==: 4
Finished working on name:  Echo
Finished working on name:  Golf
Launching thread with name:  Kilo
Launching thread with name:  Lima

== Current active threads ==: 4
Finished working on name:  India
Finished working on name:  Juliett
Finished working on name:  Lima
Finished working on name:  Kilo

At this point we continue on main thread


real    0m6.945s
user    0m0.034s
sys     0m0.009s

Answer 2 (score: 0)

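For those looking for a "quick fix" to limit the number of threads with the threading module in Python 3: the basic logic is to wrap your main function in a wrapper, then call the wrapper, which contains the stop/go logic.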

The code below reuses the solution proposed by Andpei. The code in his post did not work for me verbatim, so here is the modification that did:

Python3:

import threading
import time

maxthreads = 3
smphr = threading.Semaphore(value=maxthreads)
threads = list()

SomeInputCollection=("SomeInput1","SomeInput2","SomeInput3","SomeInput4","SomeInput5","SomeInput6")

def yourmainfunction(SomeInput):
    #main function
    print ("Your input was: "+ SomeInput)

def task(SomeInput):
    # yourmainfunction wrapped in a task
    print(threading.current_thread().name, 'Starting')
    # the with-block acquires the semaphore and always releases it
    with smphr:
        yourmainfunction(SomeInput)
        time.sleep(2)
        print(threading.current_thread().name, 'Exiting')


def main():
    threads = [threading.Thread(name="worker/task", target=task, args=(SomeInput,)) for SomeInput in SomeInputCollection]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
if __name__ == "__main__":
    main()

Output:

worker/task Starting
Your input was: SomeInput1
worker/task Starting
Your input was: SomeInput2
worker/task Starting
Your input was: SomeInput3
worker/task Starting
worker/task Starting
worker/task Starting
worker/task Exiting
Your input was: SomeInput4
worker/task Exiting
worker/task Exiting
Your input was: SomeInput6
Your input was: SomeInput5
worker/task Exiting
worker/task Exiting
worker/task Exiting
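
The wrapping step can also be factored into a reusable decorator. The following is only a sketch and not part of the original answer; limit_threads and process are hypothetical names, and the results list is there just to make the effect visible.

```python
import functools
import threading

def limit_threads(max_threads):
    """Return a decorator that lets at most max_threads calls run at once."""
    sema = threading.BoundedSemaphore(max_threads)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with sema:  # released even if func raises
                return func(*args, **kwargs)
        return wrapper
    return decorator

results = []
results_lock = threading.Lock()

@limit_threads(3)
def process(some_input):
    # Stand-in for the real work; records what it was given.
    with results_lock:
        results.append("Your input was: " + some_input)

inputs = ["SomeInput%d" % n for n in range(1, 7)]
threads = [threading.Thread(target=process, args=(s,)) for s in inputs]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

Each decorated function gets its own semaphore, so the limit applies per function and not across the whole program.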