How to pick a random proxy on every call

Time: 2020-09-12 16:49:33

Tags: python selenium

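I have a script that connects through a random proxy taken from a proxies.txt file. I have verified that it connects successfully, so that part works. However, every time I call the function while the code is running, it connects through the same proxy that was chosen at startup. I want it to switch to a different proxy on every call.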

def get_single_proxy():
    proxy_list = [line.replace('\n', '') for line in open('proxies.txt', 'r')]
    proxy = random.choice(proxy_list)
    return proxy

PROXY = get_single_proxy()

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size=3840x2160")
chrome_options.add_argument('--proxy-server=%s' % PROXY)

driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=chrome_driver)

async def start(ctx):
    driver.get(URL)
    print(PROXY)

Update: following the suggestion below:

class ProxyRotator:
    def __init__(self):
        #self.proxylist = [line.replace('\n', '') for line in open('proxies.txt', 'r')]
        self.proxyList = ['45.72.40.18:80', '45.130.127.12:80', '45.87.243.138:80']

    def get(self):
        """
        Optionally you could shuffle self.proxyList every X minutes or 
        after all proxies had been fetched once ...
        """
        proxy = self.proxyList.pop(0)
        self.proxyList.append(proxy)
        return proxy


pr = ProxyRotator()
for x in range(6):
    print(pr.get())
    
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size=3840x2160")
chrome_options.add_argument('--proxy-server=%s' % pr)

driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=chrome_driver)

async def start(ctx):
    driver.get(URL)
    print(pr)

2 answers:

Answer 0 (score: 0)

One possible solution is to use a class that stores the current proxy and always returns one that differs from it, as the code below shows:

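import random

class Proxy():
    def __init__(self):
        self.actual_proxy = None  # proxy returned by the previous call

    def get_single_proxy(self):
        # Re-read the file on every call so edits to proxies.txt are picked up.
        with open('proxies.txt', 'r') as f:
            proxy_list = [line.strip() for line in f if line.strip()]
        if not proxy_list:
            raise ValueError("proxy_list is empty")
        # Exclude the previous proxy; fall back to the full list when nothing
        # else is left (avoids looping forever on a single-entry file).
        candidates = [p for p in proxy_list if p != self.actual_proxy] or proxy_list
        self.actual_proxy = random.choice(candidates)
        return self.actual_proxy

p = Proxy()


for i in range(20):
    PROXY = p.get_single_proxy()
    print(f'rotate {i:02d} -> {PROXY}')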

Output:

rotate 00 -> 50.50.50.50
rotate 01 -> 30.30.30.30
rotate 02 -> 90.90.90.90
rotate 03 -> 80.80.80.80
rotate 04 -> 70.70.70.70
rotate 05 -> 80.80.80.80
rotate 06 -> 40.40.40.40
rotate 07 -> 70.70.70.70
rotate 08 -> 30.30.30.30
rotate 09 -> 50.50.50.50
rotate 10 -> 80.80.80.80
rotate 11 -> 40.40.40.40
rotate 12 -> 10.10.10.10
rotate 13 -> 70.70.70.70
rotate 14 -> 40.40.40.40
rotate 15 -> 50.50.50.50
rotate 16 -> 80.80.80.80
rotate 17 -> 90.90.90.90
rotate 18 -> 20.20.20.20
rotate 19 -> 10.10.10.10

proxies.txt:

10.10.10.10
20.20.20.20
30.30.30.30
40.40.40.40
50.50.50.50
60.60.60.60
70.70.70.70
80.80.80.80
90.90.90.90
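
Picking a new proxy string is only half of the problem in the question: Chrome reads `--proxy-server` when the browser is launched, so a driver created once at module level keeps the proxy it started with. A minimal sketch of one way around that, assuming it is acceptable to start a fresh browser for each call. The helper names `load_proxies`, `build_chrome_arguments`, `make_driver` and `fetch_with_random_proxy` are hypothetical, and the sketch assumes a Selenium version that accepts `webdriver.Chrome(options=...)` and can locate chromedriver on its own or on the PATH.

```python
import random


def load_proxies(path="proxies.txt"):
    """Return the non-empty lines of the proxy file."""
    with open(path, "r") as f:
        return [line.strip() for line in f if line.strip()]


def build_chrome_arguments(proxy):
    """Chrome command-line arguments for one session routed through `proxy`."""
    return ["--headless", f"--proxy-server={proxy}"]


def make_driver(proxy):
    """Start a new Chrome session that uses `proxy` (hypothetical helper)."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in build_chrome_arguments(proxy):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


def fetch_with_random_proxy(url, proxies):
    """Open `url` in a fresh driver through a randomly chosen proxy."""
    proxy = random.choice(proxies)
    driver = make_driver(proxy)
    try:
        driver.get(url)
        return proxy, driver.title
    finally:
        driver.quit()  # close the browser so sessions do not pile up
```

Calling `fetch_with_random_proxy(URL, load_proxies())` from the handler then selects a proxy per call instead of once at import time. Starting a browser per call is slow, so this trades speed for a guaranteed proxy change.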

Answer 1 (score: 0)

You can pop the first proxy off the list and append it to the end:

class ProxyRotator:
    def __init__(self):
        # self.proxyList = [line.strip() for line in open('proxies.txt', 'r')]
        self.proxyList = ['proxy1', 'proxy2', 'proxy3']

    def get(self):
        """
        Optionally you could shuffle self.proxyList every X minutes or 
        after all proxies had been fetched once ...
        """
        proxy = self.proxyList.pop(0)
        self.proxyList.append(proxy)
        return proxy


pr = ProxyRotator()
for x in range(6):
    print(pr.get())

Output:

proxy1
proxy2
proxy3
proxy1
proxy2
proxy3
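
The docstring above mentions shuffling the list once every proxy has been fetched. A minimal sketch of that variant, assuming an in-memory list of proxies. The class name `ShufflingProxyRotator` and its `rng` parameter are hypothetical and exist only to make the order reproducible in tests.

```python
import random


class ShufflingProxyRotator:
    """Hands out every proxy once per pass, reshuffling between passes."""

    def __init__(self, proxies, rng=None):
        if not proxies:
            raise ValueError("proxies must not be empty")
        self._proxies = list(proxies)
        self._rng = rng or random.Random()
        self._queue = []  # proxies not yet handed out in the current pass

    def get(self):
        if not self._queue:
            # Start a new pass in a fresh random order.
            self._queue = self._proxies[:]
            self._rng.shuffle(self._queue)
        return self._queue.pop()


pr = ShufflingProxyRotator(['proxy1', 'proxy2', 'proxy3'])
for _ in range(6):
    print(pr.get())
```

Each group of three calls returns all three proxies in a random order, although the last proxy of one pass can coincide with the first of the next. Whichever rotator is used, the value passed to `--proxy-server` has to be the string returned by `get()`, not the rotator object itself as in the update to the question.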