I am trying to scrape some web pages, but the page content I get back differs from what I see in Firefox.
Here is my code:
import requests
from bs4 import BeautifulSoup

url = "https://www.sareb.es/es_ES/inmuebles"
# verify=False disables TLS certificate verification
with requests.get(url, verify=False) as html_file:
    soup = BeautifulSoup(html_file.content, "html.parser")
print(soup.find_all("h3"))
I want to scrape the prices, which are in h3 tags, but the output of soup.find_all("h3") does not show them.
Is there any way to retrieve the "same" web page? Thanks
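Answer 0 (score: 1)
You can use requests to fetch the JSON response directly, and loop over the page number to get more results. The total number of pages is 1217.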
import requests

url = "https://www.sareb.es/dynamic/assets/json?"
params = {
    "lang": "en_US",
    "page": "1",  # change this value to fetch other pages
    "orderField": "score",
    "orderDirection": "DESC",
    "compId": "7aa1f42482964610VgnVCMServera5ecbf0aRCRD",
    "rtbPage": "Home > Inmuebles",
}
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36",
    "X-CSRF-TOKEN": "5618a1c8-9d8e-4a88-95a8-2eef4c2b3455",
    "X-Requested-With": "XMLHttpRequest",
}
# verify=False disables TLS certificate verification, as in the question
r = requests.get(url, params=params, headers=headers, verify=False)
d = r.json()
# The listings are nested under result -> assetsPage -> content
results = d['result']['assetsPage']['content']
for result in results:
    print(result['type'], result['price'], result['city'], result['district'])
Output:
Country House 281.000 € Sueca Valencia/València
Country House To consult Moaña Pontevedra
Country House 152.000 € Corcos Valladolid
Country House 36.000 € Alcalà de Xivert Castellón/Castelló
Office 130.800 € Sagunto/Sagunt Valencia/València
Office 643.370 € Valencia Valencia/València
Office 646.495 € Valencia Valencia/València
Office 3.100 € Valencia Valencia/València
Office 144.700 € / 1.070 € Palmas de Gran Canaria (Las) Palmas, Las
Office 326.400 € / 1.635 € Murcia Murcia
Offices From 60.000 € Colmenar Viejo Madrid
Office 519.300 € / 2.564 € Alicante/Alacant Alicante/Alacant
Office To consult Palma de Mallorca Balears, Illes
Office 97.000 € Santa Lucía de Tirajana Palmas, Las
Offices From 444.200 € Sevilla Sevilla
Offices From 34.700 € Villamayor Salamanca
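The loop over page numbers mentioned above could be sketched roughly as follows. This is only a minimal sketch: the helper names (build_params, extract_rows, iter_assets, fetch_with_requests) are hypothetical, the delay between requests is an assumption, and it assumes the endpoint keeps returning the same JSON layout (result -> assetsPage -> content) for every page. The X-CSRF-TOKEN value is copied from the code above and may well be tied to a session, so it might need replacing.

```python
import time


def build_params(page):
    """Query parameters from the answer, with only the page number varying."""
    return {
        "lang": "en_US",
        "page": str(page),
        "orderField": "score",
        "orderDirection": "DESC",
        "compId": "7aa1f42482964610VgnVCMServera5ecbf0aRCRD",
        "rtbPage": "Home > Inmuebles",
    }


def extract_rows(payload):
    """Return the list of listings, or an empty list if the keys are missing."""
    return payload.get("result", {}).get("assetsPage", {}).get("content", [])


def iter_assets(fetch_json, first_page, last_page, delay=0.0):
    """Yield listings page by page; stop early when a page comes back empty.

    fetch_json is any callable that takes the params dict and returns the
    decoded JSON, which keeps the loop independent of the HTTP library.
    """
    for page in range(first_page, last_page + 1):
        rows = extract_rows(fetch_json(build_params(page)))
        if not rows:
            break
        yield from rows
        time.sleep(delay)  # assumed politeness pause between requests


def fetch_with_requests(params):
    """One possible fetch_json built on requests, mirroring the answer."""
    import requests  # imported here so the helpers above work without it

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64)",
        # Copied from the answer; assumed to be session-specific.
        "X-CSRF-TOKEN": "5618a1c8-9d8e-4a88-95a8-2eef4c2b3455",
        "X-Requested-With": "XMLHttpRequest",
    }
    url = "https://www.sareb.es/dynamic/assets/json"
    r = requests.get(url, params=params, headers=headers, verify=False)
    r.raise_for_status()
    return r.json()
```

For example, `for row in iter_assets(fetch_with_requests, 1, 3, delay=1.0): print(row['type'], row['price'])` would walk the first three pages. Passing the fetch function in as an argument is just one design choice; it makes the paging logic easy to try out with canned data before hitting the site.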