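So I scraped some proxies using Proxy Broker. Sometimes the proxies are already dead by the time they are scraped, so I want to check them before using them. I wrote a program with Python Requests to do that. Here it is: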
import time
import random
import requests

lines = open('not_checked.txt').read().splitlines()
check = random.choice(lines)
yaya = {
    check
}
for x in range(0, 10):
    requests.get('https://reg.ebay.com/reg/PartialReg?ru=https%3A%2F%2Fwww.ebay.com%2F', proxies=yaya)
    r.status_code
    print(status_code)
    if status_code == 200:
        f = open("checked_proxies.txt", "a+")
        f.write(proxies)
    else:
        time.sleep(.001)
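However, this throws "'set' object has no attribute 'get'". I looked the error up online, and it said this happens because I used a comma instead of a colon. So I tried: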
requests.get('https://reg.ebay.com/reg/PartialReg?ru=https%3A%2F%2Fwww.ebay.com%2F': proxies=yaya)
and got a syntax error. What is going on?
Answer 0 (score: 2)
proxies must be a dictionary. It's right there in the docs:
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
Your yaya is a set, not a dict.
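To make the difference concrete, here is a minimal sketch in plain Python with no network access. The address is just the placeholder from the docs example above, and reusing one address for both keys is an assumption made for illustration. Braces around a bare value build a set, which has no get method, whereas braces around key: value pairs build a dict, which does.

```python
# Placeholder address, taken from the docs example; not a real proxy.
address = "10.10.1.10:3128"

# Braces around a bare value create a set.
as_set = {address}

# Braces around key: value pairs create a dict.
as_dict = {
    "http": "http://" + address,
    "https": "http://" + address,
}

print(type(as_set).__name__)    # set
print(type(as_dict).__name__)   # dict
print(hasattr(as_set, "get"))   # False
print(as_dict.get("https"))     # http://10.10.1.10:3128
```

The error message in the question is consistent with this: something looked up get on the object passed as proxies and found a set.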
Answer 1 (score: 1)
It definitely has to be a comma, not a colon. Sample snippet:
import time
import random
import requests

lines = open('proxies.txt').read().splitlines()
# check =random.choice(lines)

# Each entry maps a URL scheme to the proxy used for that scheme.
proxies = [
    {
        "http": "XXX.XXX.XXX.XXX:XXXX",
        "https": "XXX.XXX.XXX.XXX:XXXX",
    },
    {
        "http": "XXX.XXX.XXX.XXX:XXXX",
        "https": "XXX.XXX.XXX.XXX:XXXX",
    },
    {
        "http": "XXX.XXX.XXX.XXX:XXXX",
        "https": "XXX.XXX.XXX.XXX:XXXX",
    }
]

for proxy in proxies:
    print("Requesting with %s and %s" % (proxy['http'], proxy['https']))
    # proxies is an ordinary keyword argument, so it follows a comma.
    r = requests.get('https://reg.ebay.com/reg/PartialReg?ru=https%3A%2F%2Fwww.ebay.com%2F', proxies=proxy)
    print("Loaded")
    print(r.status_code)
    if r.status_code == 200:
        # write() needs a string, so store the address rather than the dict.
        with open("checked_proxies.txt", "a+") as f:
            f.write(proxy['http'] + "\n")
    else:
        time.sleep(.001)
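For the original goal of filtering a file of scraped proxies, a sketch along the following lines may be a useful starting point. It rests on several assumptions that are not stated in the question: each line of not_checked.txt is a plain host:port string, the proxies are ordinary HTTP proxies (hence the http:// prefix for both keys), and a 5 second timeout is acceptable. The helper names to_proxies, is_alive and check_file are made up for this example. A dead proxy typically raises an exception instead of returning a status code, so the request is wrapped in try/except.

```python
import requests

TEST_URL = "https://reg.ebay.com/reg/PartialReg?ru=https%3A%2F%2Fwww.ebay.com%2F"


def to_proxies(address):
    """Build the dict that requests expects from a 'host:port' string (assumed format)."""
    return {"http": "http://" + address, "https": "http://" + address}


def is_alive(address, url=TEST_URL, timeout=5.0, get=requests.get):
    """Return True if a GET through the proxy answers with HTTP 200.

    The get parameter exists only so that the function can be exercised
    without real network access.
    """
    try:
        response = get(url, proxies=to_proxies(address), timeout=timeout)
    except requests.RequestException:
        # Connection errors, proxy errors and timeouts all end up here.
        return False
    return response.status_code == 200


def check_file(src="not_checked.txt", dst="checked_proxies.txt", **kwargs):
    """Append every working proxy from src to dst, one address per line."""
    with open(src) as infile:
        addresses = [line.strip() for line in infile if line.strip()]
    with open(dst, "a") as outfile:
        for address in addresses:
            if is_alive(address, **kwargs):
                outfile.write(address + "\n")
```

Calling check_file() with no arguments would read not_checked.txt and append the working addresses to checked_proxies.txt. Every proxy in the file is tested once, in order, rather than picking one at random.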