BeautifulSoup struggling to scrape listing detail pages

Time: 2020-06-27 13:27:28

Tags: python web-scraping beautifulsoup

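I'm still a rookie in the Python world. I'm trying to build a scraper that would be useful in my daily work, but I'm stuck at one particular point:

My goal is to scrape a real estate website. I'm using BeautifulSoup, and I can get the parameters from the listings page without any problem. But when I go into a listing's detail page, I can't scrape any data.

My code: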

from bs4 import BeautifulSoup
import requests

url = "https://timetochoose.co.ao/?search-listings=true"

headers = {'User-Agent': 'whatever'}

response = requests.get(url, headers=headers)

print(response)

data = response.text

print(data)

soup = BeautifulSoup(data, 'html.parser')

anuncios = soup.find_all("div", {"class": "grid-listing-info"})

for anuncios in anuncios:
    titles = anuncios.find("a",{"class": "listing-link"}).text
    location = anuncios.find("p",{"class": "location muted marB0"}).text
    link = anuncios.find("a",{"class": "listing-link"}).get("href")
    anuncios_response = requests.get(link)
    anuncios_data = anuncios_response.text
    anuncios_soup = BeautifulSoup(anuncios_data, 'html.parser')
    conteudo = anuncios_soup.find("div", {"id":"listing-content"}).text


    print("Título", titles, "\nLocalização", location, "\nLink", link, "\nConteudo", conteudo)

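For example, the "conteudo" variable ends up with nothing in it. I have tried to get other data from the detail page, such as the price or the number of rooms, but it always fails and I only get "None".

I have been looking for an answer since yesterday afternoon, but I can't figure out where it goes wrong. I can get the parameters on the main page without any problem, but as soon as I reach the listing detail page level, it just fails.

I'd be grateful if someone could point out what I'm doing wrong. Thanks in advance for taking the time to read my question.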

1 Answer:

Answer 0: (score: 1)

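To get the correct page, you need to set the User-Agent HTTP header on every request. In your code the first request passes headers=headers, but the request for each detail page (requests.get(link)) does not, so the site most likely serves a different page that has no #listing-content element.

For example: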

import requests
from bs4 import BeautifulSoup


main_url = 'https://timetochoose.co.ao/?search-listings=true'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}


def print_info(url):
    soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')
    print(soup.select_one('#listing-content').get_text(strip=True, separator='\n'))


soup = BeautifulSoup(requests.get(main_url, headers=headers).content, 'html.parser')
for a in soup.select('a.listing-featured-image'):
    print(a['href'])
    print_info(a['href'])
    print('-' * 80)

Prints:

https://timetochoose.co.ao/listings/loja-rua-rei-katiavala-luanda/
Avenida brasil , Rua katiavala
Maculusso
Loja com 90 metros quadrados
2 andares
1 wc
Frente a estrada
Arrendamento  mensal 500.000 kz Negociável
--------------------------------------------------------------------------------
https://timetochoose.co.ao/listings/apertamento-t3-rua-cabral-montcada-maianga/
Apartamento T3 maianga
1  suíte com varanda
2 quartos com varanda
1 wc
1 sala comum grande
1 cozinha
Tanque de  agua
Predio limpo
Arrendamento 350.000  akz Negociável
--------------------------------------------------------------------------------

...and so on.
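
One way to avoid forgetting the header on a follow-up request is to set it once on a requests.Session, so that every call made through the session carries it. The sketch below is only an illustration under that assumption: the helper names make_session, extract_content and scrape are hypothetical, the User-Agent string is the one from the answer above, and the selectors are the ones already used there. extract_content returns None instead of raising when the element is missing, which makes a blocked or unexpected page easier to spot.

```python
import requests
from bs4 import BeautifulSoup

# Assumed value: the browser User-Agent string used in the answer above.
USER_AGENT = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0"


def make_session(user_agent=USER_AGENT):
    """Return a Session that sends the same User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def extract_content(html):
    """Return the text of #listing-content, or None if the element is absent."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one("#listing-content")
    if node is None:
        return None
    return node.get_text(strip=True, separator="\n")


def scrape(main_url):
    """Print the detail text of each listing linked from main_url (needs network access)."""
    session = make_session()
    soup = BeautifulSoup(session.get(main_url).content, "html.parser")
    for a in soup.select("a.listing-featured-image"):
        detail = session.get(a["href"])  # the header is sent here as well
        print(a["href"])
        print(extract_content(detail.content) or "(no #listing-content found)")
        print("-" * 80)
```

Calling scrape('https://timetochoose.co.ao/?search-listings=true') should behave like the answer's loop, provided the site still uses the same markup.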