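I found a way to scrape other websites, but this site requires a special "browser" (a User-Agent header) before it will return the HTML. After I added one, the program no longer crashes, but it still does not work.
The variables I want: rank, name, code, points (https://imgur.com/a/FIWDFk1)
This is the code I wrote, but it does not work on this site: [runs, but nothing is read or saved]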
from urllib.request import urlopen as uReq
from urllib.request import Request
from bs4 import BeautifulSoup as soup

myUrl = "https://mee6.xyz/levels/159962941502783488"

# Send a browser-like User-Agent so the site does not reject the request
req = Request(
    myUrl,
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36'
    }
)

# Download the page
uClient = uReq(req)
pageHtml = uClient.read()
uClient.close()

# Parse the HTML and look for the player entries
page_soup = soup(pageHtml, "html.parser")
containers = page_soup.findAll("div", {"class": "Player"})
print(containers)
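The code I am using comes from a YouTube tutorial. When I change the URL, it does not work with the mee6 leaderboard, because the site rejects the browser: [crashes with the mee6 URL]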
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import csv

my_url = "https://www.newegg.ca/Product/ProductList.aspx?Submit=ENE&N=100007708%20601210955%20601203901%20601294835%20601295933%20601194948&IsNodeId=1&bop=And&Order=BESTSELLING&PageSize=96"

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class": "item-container"})

filename = "GPU Prices.csv"
header = ['Price', 'Product Brand', 'Product Name', 'Shipping Cost']

with open(filename, 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(header)

    for container in containers:
        price_container = container.findAll("li", {"class": "price-current"})
        price = price_container[0].text.replace('\xa0', ' ').strip(' –\r\n|')

        brand = container.div.div.a.img["title"]

        title_container = container.findAll("a", {"class": "item-title"})
        product_name = title_container[0].text

        shipping_container = container.findAll("li", {"class": "price-ship"})
        shipping = shipping_container[0].text.strip()

        csv_output.writerow([price, brand, product_name, shipping])
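Answer 0 (score: 0)
Try the following to get the data from that page. The page loads its content dynamically, so requests will not help you get the data if you stick to the original URL. Use the browser's dev tools to find the JSON link, as I did here. Give it a try: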
import requests

URL = 'https://mee6.xyz/api/plugins/levels/leaderboard/159962941502783488'

res = requests.get(URL)
for item in res.json()['players']:
    name = item['username']
    discriminator = item['discriminator']
    xp = item['xp']
    print(name, discriminator, xp)
The output looks like this:
Sil 5262 891462
Birdie♫ 6017 745639
Delta 5728 641571
Mr. Squishy 0001 308349
Majick 6918 251024
Samuel (xCykrix) 1101 226470
WolfGang1710 6782 222741
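The question also asks for a rank, which the loop above does not print. A minimal sketch, assuming the `players` list comes back already sorted by XP in descending order (the sample output is consistent with that, but it is an assumption about the API): derive the rank from the position in the list with `enumerate`. The helper name `rank_players` is hypothetical, and the sample rows are copied from the output above so the snippet runs without a network call.

```python
def rank_players(players):
    """Return (rank, username, discriminator, xp) tuples, with rank starting at 1.

    Assumes `players` is already ordered from highest to lowest XP.
    """
    return [
        (rank, p["username"], p["discriminator"], p["xp"])
        for rank, p in enumerate(players, start=1)
    ]


# Sample rows taken from the output shown above.
# With a live response you would pass res.json()['players'] instead.
sample = [
    {"username": "Sil", "discriminator": "5262", "xp": 891462},
    {"username": "Birdie♫", "discriminator": "6017", "xp": 745639},
    {"username": "Delta", "discriminator": "5728", "xp": 641571},
]

ranked = rank_players(sample)
```

Each tuple in `ranked` can then be printed or written to a file in the same way as the name, discriminator and XP values.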
To write the results to a csv file, you can do this:
import requests
import csv

headers = ['Name', 'Discriminator', 'Xp']

res = requests.get('https://mee6.xyz/api/plugins/levels/leaderboard/159962941502783488')

# newline='' avoids blank rows on Windows; utf-8 keeps non-ASCII usernames intact
with open('leaderboard.csv', 'w', newline='', encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(headers)

    for item in res.json()['players']:
        name = item['username']
        discriminator = item['discriminator']
        xp = item['xp']
        print(name, discriminator, xp)
        writer.writerow([name, discriminator, xp])
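As an alternative to picking each field out by hand, `csv.DictWriter` can write the player dictionaries directly and skip any keys you do not list. This is only a sketch: the function name `write_players` and the `extra_field` key are made up for illustration, and the sample row stands in for `res.json()['players']` so the snippet runs offline.

```python
import csv
import io

# Columns to keep; these key names are the ones used in the code above.
FIELDS = ["username", "discriminator", "xp"]


def write_players(players, fileobj):
    """Write the selected fields of each player dict to fileobj as CSV."""
    # extrasaction="ignore" drops any keys that are not listed in FIELDS
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(players)


# Sample row from the output above, plus a hypothetical extra key
# to show that unlisted keys are skipped.
sample = [
    {"username": "Sil", "discriminator": "5262", "xp": 891462, "extra_field": "ignored"},
]

buffer = io.StringIO()
write_players(sample, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, pass a file opened with `newline=''` and `encoding="utf-8"`, as in the code above.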