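I am scraping a website with Python and Selenium. I have been using the find_element_by_* methods to find all the values I need, but I have run into a harder case. The site's HTML uses exactly the same structure for two different values, so I can't use a simple find_element_by_class_name, because both elements have the same class and ID. I would rather not use XPath or selectors, because I am iterating over many "flight-row" divs and that would make things more hard-coded.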
<div class="flight-row">
<div class="row row-eq-heights">
<div class="col-xs-4 col-md-4 no-padding"><span class="airline-name">gol</span><span class="flight-number">AM-477</span></div>
<div class="col-xs-4 col-md-4">
<div class="flight-timming"><span class="flight-time">06:15</span><span class="flight-destination">IAH</span></div><span class="flight-data">01/10/19</span></div>
<div class="col-xs-4 col-md-4 no-padding">
<div class="duration"><span class="flight-duration">21:25</span><span class="flight-stops" aria-label="Paradas do voo">2 paradas</span></div>
</div>
<div class="col-xs-4 col-md-4">
<div class="flight-timming"><span class="flight-destination">GIG</span><span class="flight-time">05:40</span></div><span class="flight-data">02/10/19</span></div>
</div>
</div>
I want to get the values of flight-time, flight-destination and flight-data from both "col-xs-4 col-md-4" divs.
Here is some of my code:
outbound_flights = driver.find_elements_by_css_selector("div[class^='flight-item ']")
for outbound_flight in outbound_flights:
    airline = outbound_flight.find_element_by_css_selector("span[class='airline-name']")
Thanks!
Answer 0 (score: 1)
You can get the values by index:
(//*[@class='flight-time'])[1]
and (//*[@class='flight-time'])[2]
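One caveat with the indexed XPath: an expression that starts with // is evaluated from the document root even when it is called on a row element, so inside a loop over flight-row divs it would keep returning the first row's values. A relative form such as (.//*[@class='flight-time'])[1] keeps the search inside the current row. The sketch below illustrates that per-row indexing offline, using Python's standard xml.etree.ElementTree instead of Selenium so that it runs without a browser. The helper name times_per_row and the wrapper div are assumptions made for the example, and ElementTree only works here because the snippet from the question happens to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# One flight-row from the question, trimmed to the two flight-timming blocks.
ROW = (
    '<div class="flight-row">'
    '<div class="flight-timming"><span class="flight-time">06:15</span>'
    '<span class="flight-destination">IAH</span></div>'
    '<div class="flight-timming"><span class="flight-destination">GIG</span>'
    '<span class="flight-time">05:40</span></div>'
    '</div>'
)


def times_per_row(markup):
    """Return one (first, second) flight-time pair for every flight-row."""
    root = ET.fromstring(markup)
    pairs = []
    for row in root.findall(".//div[@class='flight-row']"):
        # The leading "." keeps the search inside this row only.
        times = row.findall(".//span[@class='flight-time']")
        # Python lists are 0-based, whereas XPath positions are 1-based.
        pairs.append((times[0].text, times[1].text))
    return pairs


# Repeat the same row twice under a wrapper element to mimic a results page
# (the wrapper is hypothetical; the real page layout is not shown).
page = "<div>" + ROW * 2 + "</div>"
print(times_per_row(page))
```

Each row yields its own pair, which is the behaviour the relative XPath would give with Selenium.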
Answer 1 (score: 1)
Try the following CSS selectors to get flight-time, flight-destination and flight-data:
# Select only the columns that hold flight details (skip the "no-padding" ones).
outbound_flights = driver.find_elements_by_css_selector("div.col-xs-4.col-md-4:not(.no-padding)")
for outbound_flight in outbound_flights:
    # Each lookup is scoped to the current column element.
    flight_time = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-time").text
    print(flight_time)
    flight_destination = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-destination").text
    print(flight_destination)
    flight_data = outbound_flight.find_element_by_css_selector("span.flight-data").text
    print(flight_data)
06:15
IAH
01/10/19
05:40
GIG
02/10/19
Edited answer:
outbound_flights = driver.find_elements_by_css_selector("div.col-xs-4.col-md-4:not(.no-padding)")
flighttime = []  # collects the flight-time of each column, in page order
for outbound_flight in outbound_flights:
    flight_time = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-time").text
    print(flight_time)
    flighttime.append(flight_time)
    flight_destination = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-destination").text
    print(flight_destination)
    flight_data = outbound_flight.find_element_by_css_selector("span.flight-data").text
    print(flight_data)

# After the loop: the first column is the departure, the second the arrival.
departure_time = flighttime[0]
arrival_time = flighttime[1]
print("Departure time :" + departure_time)
print("Arrival time :" + arrival_time)
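If all three values are needed for both legs, it may be tidier to keep them together per column rather than in a single flat list of times. The following is only a sketch of that idea, run against the markup from the question with the standard library's xml.etree.ElementTree rather than a live browser; the function name parse_flight_row and the dictionary keys are made up for illustration. With Selenium the same lookups would go through the element methods instead (in Selenium 4, find_elements(By.CSS_SELECTOR, ...) replaces find_elements_by_css_selector).

```python
import xml.etree.ElementTree as ET

# The flight-row markup from the question, unchanged apart from indentation.
ROW_HTML = """
<div class="flight-row">
  <div class="row row-eq-heights">
    <div class="col-xs-4 col-md-4 no-padding"><span class="airline-name">gol</span><span class="flight-number">AM-477</span></div>
    <div class="col-xs-4 col-md-4">
      <div class="flight-timming"><span class="flight-time">06:15</span><span class="flight-destination">IAH</span></div><span class="flight-data">01/10/19</span></div>
    <div class="col-xs-4 col-md-4 no-padding">
      <div class="duration"><span class="flight-duration">21:25</span><span class="flight-stops" aria-label="Paradas do voo">2 paradas</span></div>
    </div>
    <div class="col-xs-4 col-md-4">
      <div class="flight-timming"><span class="flight-destination">GIG</span><span class="flight-time">05:40</span></div><span class="flight-data">02/10/19</span></div>
  </div>
</div>
"""


def parse_flight_row(markup):
    """Return the departure and arrival details of a single flight-row."""
    row = ET.fromstring(markup.strip())
    legs = []
    # An exact class match skips the "col-xs-4 col-md-4 no-padding" columns.
    for col in row.findall(".//div[@class='col-xs-4 col-md-4']"):
        legs.append({
            "time": col.find(".//span[@class='flight-time']").text,
            "airport": col.find(".//span[@class='flight-destination']").text,
            "date": col.find(".//span[@class='flight-data']").text,
        })
    # Assumes exactly two detail columns per row: departure first, then arrival.
    departure, arrival = legs
    return {"departure": departure, "arrival": arrival}


flight = parse_flight_row(ROW_HTML)
print("Departure time: " + flight["departure"]["time"])
print("Arrival time: " + flight["arrival"]["time"])
```

The unpacking into departure and arrival fails loudly if a row ever has more or fewer than two detail columns, which seems preferable to silently picking the wrong index.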