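I'm using Selenium WebDriver to search this site, https://www.idealista.com/venta-viviendas/marbella-malaga/, and extract data from it. I'd like to end up with a table containing each house's price (class item_price), number of rooms (class item_detail), and square meters (class item_detail).
I believe I have to use the driver.find_elements() method, but I don't know where to add it, or how to make sure all the prices, rooms, and square meters end up in a single table with three columns.
So far I have the code below, but it doesn't work properly. I can see Firefox going through the pages, but the data doesn't seem to be saved to houses.csv. Can anyone help? Thanks.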
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
import pandas as pd
import csv
import time
driver = webdriver.Firefox()
driver.get("https://www.idealista.com/venta-viviendas/marbella/las-chapas-el-rosario/")
pages_remaining = True
price = []
rooms = []
size = []
while pages_remaining:
    price = driver.find_elements_by_class_name("item-price")
    rooms = driver.find_elements_by_xpath("//*[contains(text(), 'hab.')]")
    size = driver.find_elements_by_xpath("//*[contains(text(), 'm²')]")
    houses = [price, rooms, size]
    try:
        # Checks if there are more pages with links
        next_link = driver.find_element_by_class_name("icon-arrow-right-after")
        next_link.click()
        time.sleep(10)
    except NoSuchElementException:
        rows_remaining = False
with open('houses.csv', 'wb') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerows(houses)
print(houses)
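
To make the three-column table I'm after more concrete, here is a minimal sketch of just the row-building and CSV-writing step, with Selenium left out. It assumes the text of each element has already been pulled out as strings (for example via each WebElement's .text attribute), that the three lists line up one-to-one per listing, and that this is Python 3, where the csv module expects a file opened in text mode with newline=''. The helper names to_rows and write_houses_csv and the sample values are hypothetical placeholders, not real data from the site.

```python
import csv


def to_rows(prices, rooms, sizes):
    """Pair three per-page lists into one (price, rooms, size) tuple per listing.

    zip() stops at the shortest list, so a listing with a missing field would
    misalign or drop rows; this sketch assumes the lists line up one-to-one.
    """
    return list(zip(prices, rooms, sizes))


def write_houses_csv(path, rows):
    """Write a header line followed by one CSV line per listing."""
    # Text mode with newline='' is what the csv module expects in Python 3.
    with open(path, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["price", "rooms", "size"])
        writer.writerows(rows)


# Made-up placeholder strings standing in for the text scraped from each page.
pages = (
    (["450.000 €", "1.200.000 €"], ["3 hab.", "5 hab."], ["120 m²", "340 m²"]),
    (["295.000 €"], ["2 hab."], ["85 m²"]),
)

all_rows = []
for prices, rooms, sizes in pages:
    # Accumulate across pages instead of overwriting the result on each pass.
    all_rows.extend(to_rows(prices, rooms, sizes))

write_houses_csv("houses.csv", all_rows)
```

The point of the sketch is the shape of the data: one row per house, accumulated across all pages, rather than three separate lists that get replaced on every iteration of the loop.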