Question

I am trying to scrape the "listing key specs" from this site:

https://www.autotrader.co.uk/car-search?radius=30&postcode=ss156ee&onesearchad=Used&make=Renault&model=zoe&page=1

However, I am only interested in the mileage spec, not the bhp or any of the other specs.

If I enter

specs=article.find('ul',class_="listing-key-specs")
print(specs.text)

I get six pieces of information, for example:

2015 (65 reg)
Hatchback
13,033 miles
88bhp
Automatic
Electric

If I enter

print(specs.li.text)

I only get the first spec, i.e.

2015 (65 reg)

How do I select one specific spec? Say, the "miles" spec?

Answer 1

You can extract the first child li of each spec list:

import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.autotrader.co.uk/car-search?radius=30&postcode=ss156ee&onesearchad=Used&make=Renault&model=zoe&page=1'
res = requests.get(url)
soup = bs(res.content, 'lxml')  # the 'lxml' parser requires the lxml package
# One entry per listing: the text of the first <li> in each spec list
details = [item.text for item in soup.select('.listing-key-specs li:first-child')]
print(details)

Less efficient alternatives:

.listing-key-specs li:nth-of-type(1)

or

.listing-key-specs :nth-child(1)

or

.listing-key-specs li:first-of-type

I am using BeautifulSoup 4.7.1, the latest version at the time of writing.
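The selectors above all target the first li, while the question asks for the mileage. As a rough sketch, the same pattern can be pointed at another position with `li:nth-of-type(3)`, assuming the mileage is always the third item as in the output quoted in the question. The `SAMPLE_HTML` markup below is a hypothetical stand-in for the real page, so the example runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the six specs quoted in the question;
# the real page may be structured differently.
SAMPLE_HTML = """
<article>
  <ul class="listing-key-specs">
    <li>2015 (65 reg)</li>
    <li>Hatchback</li>
    <li>13,033 miles</li>
    <li>88bhp</li>
    <li>Automatic</li>
    <li>Electric</li>
  </ul>
</article>
"""

# html.parser ships with Python, so no extra parser package is needed here.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Assumption: the mileage is the third <li> of every spec list.
mileages = [
    li.get_text(strip=True)
    for li in soup.select(".listing-key-specs li:nth-of-type(3)")
]
print(mileages)
```

This only works if every listing puts the specs in the same order; a listing that omits one of the earlier specs would shift the mileage to a different position.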

Answer 2

Or simply:

print(specs('li')[2].text)

Output:

15,285 miles
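Indexing with `[2]` depends on the mileage always being the third item. If that cannot be relied on, one possible alternative is to match on the text instead. The sketch below is only an illustration: `find_mileage`, `MILEAGE_RE` and the sample markup are hypothetical names and data, not part of the site or of BeautifulSoup.

```python
import re
from bs4 import BeautifulSoup

# Matches text such as "15,285 miles" and captures the numeric part.
MILEAGE_RE = re.compile(r"^(\d[\d,]*)\s*miles$", re.IGNORECASE)

def find_mileage(specs):
    """Return the mileage as an int from a spec list tag, or None if absent."""
    for li in specs.find_all("li"):
        match = MILEAGE_RE.match(li.get_text(strip=True))
        if match:
            return int(match.group(1).replace(",", ""))
    return None

# Hypothetical listing with the year entry missing, so the mileage is
# the second <li> and a fixed index of 2 would return "88bhp" instead.
sample_html = """
<ul class="listing-key-specs">
  <li>Hatchback</li>
  <li>15,285 miles</li>
  <li>88bhp</li>
</ul>
"""

specs = BeautifulSoup(sample_html, "html.parser").find("ul", class_="listing-key-specs")
mileage = find_mileage(specs)
print(mileage)
```

Returning an int rather than the raw string makes the value easier to sort or filter later, at the cost of dropping the unit.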
