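I am trying to extract data from a real estate listing website.
I want the title and the price, in the following format:
"Best property in Dubai-230000 AED"
Here is my code: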
import requests
from bs4 import BeautifulSoup
import mysql.connector

for x in range(10):
    url = 'https://sharjah.dubizzle.com/en/property-for-sale/residential/apartment/?page=' + str(x)
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    # This will find all titles.
    quotes = soup.find_all('h2', class_='listTitle en line-clamp line-clamp-2')
    # This will find all prices.
    prices = soup.find_all('div', class_='price')
    for price in prices:
        print(quotes.text)
        print(price.text)
Answer 0 (score: 0)
You can use the built-in zip() function to "bind" the titles, prices, and some other data together.
For example:
from textwrap import shorten

import requests
from bs4 import BeautifulSoup

url = 'https://sharjah.dubizzle.com/en/property-for-sale/residential/apartment/?page={}'

# Header row, centered over the three columns
print('{:^80} {:^15} {:^25}'.format('Title', 'Price', 'Place'))

for page in range(0, 2):  # <--- increase to the number of pages you want
    response = requests.get(url.format(page))
    soup = BeautifulSoup(response.text, 'lxml')

    # zip() pairs the n-th title, price and place found on the page
    for title, price, place in zip(soup.select('.listItem .listTitle'),
                                   soup.select('.listItem .price'),
                                   soup.select('.listItem .place')):
        print('{:<80} {:<15} {:<25}'.format(shorten(title.get_text(), 80), price.get_text(), place.get_text()))
Prints:
                                     Title                                            Price                Place
Spacious 2 Bed 3 Halls Apt in AL MAJAZ 3 Bukhamsen Tower                         AED 625,000     Al Majaz 3, Al Majaz
SEA VIEW, 2% REGISTRATION WAIVED , 1Y FREE SERVICE, AMAZING PYP                  AED 520,000     Al Mamzar
Best Deal | 2BR+Maid| Vacant on Transfer                                         AED 720,000     Al Anwar Tower, Al Khan
Full Sea View | Large 4BR | Al Shahd                                             AED 1,200,000   Al Shahd Tower, Al Khan
Full Sea View- Vacant 4BR+Maid with 2 Balcony                                    AED 2,200,000   Al Majaz 3, Al Majaz
Vacant 3BR+Maid with Parking in Al Shahd-Sharjah                                 AED 850,000     Al Shahd Tower, Al Khan
Investment Deal in Sharjah - Furnished 3 Bedroom                                 AED 725,000     Al Taawun, Al Qasba
3BR+Maid |Lake view | Renovated Apartment Best Deal                              AED 850,000     Al Shahd Tower, Al Khan
Best Deal in Sharjah| 2 Bed in Al Shahd Tower                                    AED 725,000     Al Shahd Tower, Al Khan
spacious 2 Bedroom For Sale in Al Marwa Tower 1 Al majaz                         AED 460,000     Al Marwa Tower 1, Al Majaz
... and so on.
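To get the exact "title-price" format asked for in the question, the same zip() pairing can be applied to the extracted text. Below is a minimal sketch that uses hard-coded lists as stand-ins for the get_text() results, so it runs without a network request. The helper name format_listings is hypothetical, and the sample values are taken from the output above. Keep in mind that zip() stops at the shorter input, so if a listing on the page has no price, the later pairs would be misaligned; the sketch assumes both lists line up.

```python
def format_listings(titles, prices):
    """Pair each title with its price and join them as 'title-price'."""
    # strip() removes the surrounding whitespace that get_text() often returns
    return ['{}-{}'.format(t.strip(), p.strip()) for t, p in zip(titles, prices)]


# Stand-in data (assumption); in the scraper these would come from tag.get_text()
titles = ['  Best Deal | 2BR+Maid| Vacant on Transfer ',
          'Full Sea View | Large 4BR | Al Shahd']
prices = ['AED 720,000', ' AED 1,200,000\n']

for line in format_listings(titles, prices):
    print(line)
```

In the scraper itself, the two arguments could be built with list comprehensions such as [t.get_text() for t in soup.select('.listItem .listTitle')], using the same selectors as in the answer.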