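I wrote a script that extracts URLs from a file and sends HTTP requests to all of them at the same time. I now want to limit the number of HTTP requests per second within a session, as well as the bandwidth per interface (eth0, eth1, etc.). Is there a way to do this in Python?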
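Answer 0 (score: 0)
You can use a Semaphore object, which is part of the standard Python library (see the Python documentation for the threading module).
Or, if you want to work with threads directly, you can use wait([timeout]).
No library bundled with Python operates on Ethernet or other network interfaces. The lowest level you can go to is sockets.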
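Since sockets are the lowest level available, one way to approximate a bandwidth cap is to throttle reads in the application itself. The following is a minimal sketch under that assumption, not a per-interface limiter: `throttled_read` and its parameters are hypothetical names, and the demo reads from an in-memory stream rather than a real connection. The cap is an average and only approximate, because the operating system may buffer data ahead of the reader.

```python
import io
import time


def throttled_read(stream, max_bytes_per_sec, chunk_size=1024):
    """Read `stream` to the end, sleeping so the average rate stays under the cap."""
    start = time.monotonic()
    received = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        received.extend(chunk)
        # Earliest moment at which this many bytes is allowed to have arrived.
        earliest = start + len(received) / max_bytes_per_sec
        delay = earliest - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return bytes(received)


# Demo on an in-memory stream (4 KiB at 16 KiB/s, so roughly a quarter second).
# Any object with a read(n) method, such as a socket file object created with
# sock.makefile('rb'), could be passed instead.
data = throttled_read(io.BytesIO(b"x" * 4096), max_bytes_per_sec=16384)
print(len(data))
```

To tie traffic to a particular interface, a socket is usually bound to that interface's local address before connecting; that step is left out of this sketch.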
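Based on your reply, here is my suggestion. Note active_count: it is only there to check that your script runs no more than two request threads. In this case the count will be three, because the first thread is your script itself and the other two are the URL requests.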

    import threading

    import requests

    # Allow at most two requests to be in flight at once.
    pool = threading.BoundedSemaphore(2)


    def worker(u):
        try:
            # Request the passed URL.
            r = requests.get(u)
            print(r.status_code)
            # Show the number of active threads.
            print(threading.active_count())
        finally:
            # Release the slot for other threads, even if the request fails.
            pool.release()


    def req():
        # Get URLs from a text file, stripping whitespace and skipping blank lines.
        with open('urllist.txt') as f:
            urls = [line.strip() for line in f if line.strip()]
        for u in urls:
            # Block here until one of the two slots is free.
            pool.acquire(blocking=True)
            # Create a new thread and pass the URL (the u parameter) to the worker.
            t = threading.Thread(target=worker, args=(u,))
            # Start the newly created thread.
            t.start()


    req()

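Answer 1 (score: 0)
You can use the worker concept described in the documentation: https://docs.python.org/3.4/library/queue.html
Add a wait() call inside the workers so that they pause between requests (in the documentation's example: inside the "while True" loop, after task_done).
Example: 5 worker threads that each wait 1 second between requests will make fewer than 5 fetches per second.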
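The worker idea above could look roughly like the sketch below. It is an illustration under assumptions, not the documentation's example: `fetch` is a placeholder for the real request (for instance `requests.get(url)`), and `run`, `NUM_WORKERS`, and `DELAY_SECONDS` are hypothetical names. A `None` sentinel is used to stop each worker.

```python
import queue
import threading
import time

NUM_WORKERS = 5       # assumed number of worker threads
DELAY_SECONDS = 1.0   # pause after each request


def fetch(url):
    # Placeholder for the real request, e.g. requests.get(url).
    return "fetched " + url


def worker(tasks, results, delay):
    while True:
        url = tasks.get()
        if url is None:
            # Sentinel: no more work for this worker.
            tasks.task_done()
            break
        try:
            results.append(fetch(url))
        finally:
            tasks.task_done()
        # Wait before taking the next URL, which caps this worker's rate.
        time.sleep(delay)


def run(urls, num_workers=NUM_WORKERS, delay=DELAY_SECONDS):
    tasks = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(tasks, results, delay))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for t in threads:
        t.join()
    return results
```

Calling `run(urls)` with the defaults gives at most 5 requests in any one-second window, since each worker sleeps for a full second after every request; the real rate is lower once request latency is added.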
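Answer 2 (score: 0)
Note that the following solution still sends requests sequentially, but it limits the TPS (transactions per second).
TL;DR: there is a class that keeps count of how many calls can still be made in the current second. The count is decremented on each call and refilled every second.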

    import time
    from multiprocessing import Process, Value


    # Naive TPS regulation.
    # This class holds a bucket of tokens which is refilled every second
    # based on the expected TPS.
    class TPSBucket:

        def __init__(self, expected_tps):
            # Shared integer counter, visible to the refill process.
            self.number_of_tokens = Value('i', 0)
            self.expected_tps = expected_tps
            # Process that constantly refills the TPS bucket.
            self.bucket_refresh_process = Process(target=self.refill_bucket_per_second)

        def refill_bucket_per_second(self):
            while True:
                print("refill")
                self.refill_bucket()
                time.sleep(1)

        def refill_bucket(self):
            self.number_of_tokens.value = self.expected_tps
            print('bucket count after refill', self.number_of_tokens.value)

        def start(self):
            self.bucket_refresh_process.start()

        def stop(self):
            self.bucket_refresh_process.kill()

        def get_token(self):
            response = False
            # Cheap check without the lock, then re-check while holding it.
            if self.number_of_tokens.value > 0:
                with self.number_of_tokens.get_lock():
                    if self.number_of_tokens.value > 0:
                        self.number_of_tokens.value -= 1
                        response = True
            return response


    def test():
        tps_bucket = TPSBucket(expected_tps=1)  # Let's say I want to send 1 request per second
        tps_bucket.start()

        total_number_of_requests = 60  # Let's say I want to send 60 requests
        request_number = 0

        t0 = time.time()
        while True:
            if tps_bucket.get_token():
                request_number += 1
                print('Request', request_number)  # This is my request
                if request_number == total_number_of_requests:
                    break
            else:
                # Avoid spinning a CPU core while waiting for the next refill.
                time.sleep(0.001)

        print(time.time() - t0, ' time elapsed')  # Some metrics to tell me how long everything took
        tps_bucket.stop()


    if __name__ == "__main__":
        test()

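Because the bucket above is polled from a single sequential loop, a variation that several threads can share may be useful when requests should still run concurrently. The sketch below is a different technique from the token bucket: it spaces calls at fixed intervals using a lock and `time.monotonic()`. `IntervalLimiter` and `send_all` are hypothetical names, and recording a timestamp stands in for the real request.

```python
import threading
import time


class IntervalLimiter:
    """Space calls at least 1/rate seconds apart across all threads."""

    def __init__(self, rate_per_sec):
        self.interval = 1.0 / rate_per_sec
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free time slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self.lock:
            now = time.monotonic()
            slot = max(self.next_slot, now)
            self.next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def send_all(items, rate_per_sec, num_threads=4):
    limiter = IntervalLimiter(rate_per_sec)
    sent_at = []

    def worker(chunk):
        for _ in chunk:
            limiter.wait()
            # Stand-in for the real request, e.g. requests.get(item).
            sent_at.append(time.monotonic())

    threads = [
        threading.Thread(target=worker, args=(items[i::num_threads],))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(sent_at)
```

For example, `send_all(urls, rate_per_sec=5)` would start at most about five requests per second in total, regardless of how many threads are used.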