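I'm trying to scrape the PADI liveaboard page for boat names, departure dates, and prices. I was able to copy an xpath from the Chrome debug console and have Selenium find the elements with it. I'd like to improve it by using a relative path, but I'm not sure how. This is what I have so far: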
from selenium import webdriver
import pdb
browser = webdriver.Chrome()
browser.get('https://travel.padi.com/s/liveaboards/caribbean/')
assert 'Caribbean' in browser.title
elem2 = browser.find_elements_by_xpath('//*[@id="search-la"]/div/div[3]/div/div[2]/div[3]/div')
print(elem2)
print(len(elem2))
browser.close()
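As you can see, the code goes to the PADI site, finds the cards for all the dive boats, and returns them to me. The xpath used here starts from the closest available ID, but from that point on it is an absolute path: div/div/div and so on. I'd like to know whether it can be changed to a relative path.
Thanks.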
Answer 0 (score: 2)
You should use class and/or id to shorten the xpath.
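One caveat when matching on class: a predicate such as @class="..." compares the whole attribute string, so it stops matching if the page adds a class or changes the whitespace. A common XPath 1.0 idiom matches a single class token instead. The helper below is only a sketch; the name has_class and the class name boat are chosen for illustration and are not part of Selenium.

```python
def has_class(name: str) -> str:
    """Build an XPath 1.0 predicate that matches one token of the class attribute."""
    # Padding the normalized class list with spaces lets ' name ' match
    # the token anywhere in the list without matching substrings of other tokens.
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"


# Hypothetical usage: any div whose class list includes "boat".
card_xpath = f"//div[{has_class('boat')}]"
print(card_xpath)
```

The resulting string can be passed to the same xpath lookup you already use for the cards.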
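Once you have found the cards, you can search inside each card with an xpath that starts with ./ so that the xpath is relative to that element and only searches inside it.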
You can also use // in any part of the xpath to skip tags that are not important.
You can also call the other find_element_by_ and find_elements_by_ methods on card. They too will search only inside that element, so the lookup is relative.
import selenium.webdriver

# Note: find_element(s)_by_* is the Selenium 3 API. It was removed in Selenium 4.3,
# where the equivalent is find_element(s)(By.XPATH, ...) with selenium.webdriver.common.by.By.
driver = selenium.webdriver.Chrome()  # or Firefox()
driver.get('https://travel.padi.com/s/liveaboards/caribbean/')

# Search the whole page for the boat cards
all_cards = driver.find_elements_by_xpath('//div[@class="boat search-page-item-card "]')

for card in all_cards:
    # The leading './/' keeps each lookup inside the current card
    title = card.find_element_by_xpath('.//a[@class="shop-title"]/span')
    desc = card.find_element_by_xpath('.//p[@class="shop-desc-text"]')
    price = card.find_element_by_xpath('.//p[@class="cur-price"]/strong/span')
    print('title:', title.text)
    print('desc:', desc.text)
    print('price:', price.text)

    # Lookups made on the card with other locators are also scoped to the card
    all_dates = card.find_elements_by_css_selector('.cell.date')
    for date in all_dates:
        day, month = date.find_elements_by_tag_name('span')
        print('date:', day.text, month.text)
    print('---')
Example result (you may see prices in a different currency):
title: CARIBBEAN EXPLORER II
desc: With incredible, off-the-beaten path itineraries that take guests to St Kitts, Saba and St Maarten, this leading liveaboard spoils divers with five dives each day, scenic geography and a unique slice of Caribbean culture.
Dates do not match your search criteria
price: PLN 824
date: 7 DEC
date: 14 DEC
date: 21 DEC
date: 28 DEC
---
title: BAHAMAS AGGRESSOR
desc: Featuring five dives a day, the well-regarded Bahamas Aggressor liveaboard is the ideal choice for divers who want to spend as much time under the water as possible then relax in an onboard Jacuzzi.
Dates do not match your search criteria
price: PLN 998
date: 7 DEC
date: 14 DEC
date: 21 DEC
date: 28 DEC
---
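To see how the leading ./ scopes a search without launching a browser, here is a minimal sketch that uses Python's standard xml.etree.ElementTree instead of Selenium. The markup is a made-up, heavily simplified stand-in for the real page, and ElementTree supports only a subset of XPath, so treat it as an illustration of the idea rather than a replacement for the Selenium code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real PADI page is far more nested.
SAMPLE = """
<html><body>
  <div class="card">
    <a class="shop-title"><span>BOAT A</span></a>
    <p class="cur-price"><strong><span>100</span></strong></p>
  </div>
  <div class="card">
    <a class="shop-title"><span>BOAT B</span></a>
    <p class="cur-price"><strong><span>200</span></strong></p>
  </div>
</body></html>
"""

root = ET.fromstring(SAMPLE)

# Find every card anywhere below the root element.
cards = root.findall('.//div[@class="card"]')

results = []
for card in cards:
    # './/' searches only inside this card, so each card yields its own title.
    title = card.find('.//a[@class="shop-title"]/span')
    # A '//' in the middle of the path skips the intermediate <strong> tag.
    price = card.find('.//p[@class="cur-price"]//span')
    results.append((title.text, price.text))

print(results)
```

One difference worth knowing: ElementTree refuses a path that starts with / when it is called on an element, whereas in Selenium an xpath that starts with // is evaluated against the whole document even when called on an element, which is why the ./ prefix matters there.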
Answer 1 (score: 0)
You need to start the xpath with ./ when searching by class inside each item.
I've written the code for you, so give it a try!
from selenium import webdriver
import pdb

browser = webdriver.Chrome()
browser.get('https://travel.padi.com/s/liveaboards/caribbean/')

# One element per boat
items = browser.find_elements_by_xpath('//div[@class="boat-info"]')

for item in items:
    # './/' makes each lookup relative to the current item
    title = item.find_element_by_xpath('.//a[@class="shop-title"]/span')
    description = item.find_element_by_xpath('.//p[@class="shop-desc-text"]')
    price = item.find_element_by_xpath('.//p[@class="cur-price"]/strong/span')
    print('TITLE: ', title.text)
    print('DESCRIPTION: ', description.text)
    print('PRICE: ', price.text)
    print('------------------NEW-RECORD------------------------')