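I built an asynchronous program that checks whether a certain element exists on multiple paths of a website. The program has a base URL and reads the paths to check for that domain from a JSON file (name.json). If the element I'm looking for exists, the program should print "1". But I quickly realized that it only checks the last item in the JSON list.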
import json
import grequests
from bs4 import BeautifulSoup

idlist = json.loads(open('name.json').read())
baseurl = 'https://steamcommunity.com/id/'

for uid in idlist:
    fullurl = baseurl + uid

rs = (grequests.get(fullurl) for uid in idlist)
resp = grequests.map(rs)
for r in resp:
    soup = BeautifulSoup(r.text, 'lxml')
    if soup.find('span', class_='actual_persona_name'):
        print('1')
    else:
        print('2')
The JSON file just contains an array of random values to test the program.
["xyz",
"sdasda9229",
"sdasda923229",
"sda",
"sda",
"sda",
"sd2",
"aaaaaa",
"aaaaaaaaa",
"aa2092425",
"aaaa23917"]
Answer 0 (score: 1)
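You never store the full URL after appending the ID to the base URL, so by the time the requests are built, fullurl holds only the last one. You have to store each URL and pass the complete URL when building the get request: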
complete_urls = []
for uid in idlist:
    fullurl = baseurl + uid
    complete_urls.append(fullurl)

rs = (grequests.get(fullurl) for fullurl in complete_urls)
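To see why only the last ID was checked, here is a minimal sketch that makes no network calls. It only builds the URL lists, using a shortened, hard-coded idlist in place of name.json. The names buggy_urls and fixed_urls are made up for illustration.

```python
baseurl = 'https://steamcommunity.com/id/'
# Assumption: a shortened stand-in for the contents of name.json.
idlist = ['xyz', 'sda', 'aaaaaa']

# Pattern from the question: the loop overwrites fullurl on every pass,
# so after the loop it holds only the URL for the last ID.
for uid in idlist:
    fullurl = baseurl + uid

# The comprehension iterates over idlist but never uses uid;
# it just repeats the leftover fullurl once per ID.
buggy_urls = [fullurl for uid in idlist]

# Fix: build each URL from its own uid inside the comprehension.
fixed_urls = [baseurl + uid for uid in idlist]

print(buggy_urls)
print(fixed_urls)
```

With the fixed list, the request generator could be written as `rs = (grequests.get(url) for url in fixed_urls)`, which should behave the same as the complete_urls loop above.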