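I'm fairly new to Python, and I'm trying to understand how to extract the 'title=' attribute from the markup below. I've been trying BeautifulSoup, but honestly anything that works would help.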
<a class="image-link" href="/new-jersey/communities/holiday-city-at-berkeley" title="Holiday City at Berkeley"><div class="lazyload pulse out exited" style="height:auto"><div class="placeholder"><svg class="svg-placeholder-component" height="100%" viewbox="0 0 400 225" width="100%"><use xlink:href="#lazyload-placeholder"></use></svg></div></div></a>
I tried all[0].find_all('a', "title") and all[0].find_all("title"), but both returned '[]'.
Answer 0 (score: 1)
You can use a CSS selector to pick out the elements you need:
from bs4 import BeautifulSoup
html = '<a class="image-link" href="/new-jersey/communities/holiday-city-at-berkeley" title="Holiday City at Berkeley"><div class="lazyload pulse out exited" style="height:auto"><div class="placeholder"><svg class="svg-placeholder-component" height="100%" viewbox="0 0 400 225" width="100%"><use xlink:href="#lazyload-placeholder"></use></svg></div></div></a>'
soup = BeautifulSoup(html, 'lxml')
# 'a[title]' matches every <a> tag that has a title attribute.
for a in soup.select('a[title]'):
    print(a['title'])
Output:
Holiday City at Berkeley
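As for why the two attempts in the question came back empty: in BeautifulSoup, a string passed as the second positional argument of find_all is matched against the CSS class, so find_all('a', "title") looks for <a class="title">, and find_all("title") on its own looks for <title> tags. Below is a minimal sketch of a find_all equivalent of the selector above. It uses a shortened copy of the markup and the built-in 'html.parser' instead of 'lxml'; both are assumptions made to keep the example small and dependency-light.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (inner <div>/<svg> omitted).
html = (
    '<a class="image-link" '
    'href="/new-jersey/communities/holiday-city-at-berkeley" '
    'title="Holiday City at Berkeley"></a>'
)
soup = BeautifulSoup(html, "html.parser")

# A string in the second position filters on CSS class, so this finds nothing.
by_class = soup.find_all("a", "title")

# A lone string filters on tag name, so this looks for <title> tags.
by_tag_name = soup.find_all("title")

# title=True keeps only <a> tags that carry a title attribute.
titles = [a["title"] for a in soup.find_all("a", title=True)]
print(by_class, by_tag_name, titles)
```

The title=True form is handy when you also want to pass other find_all arguments such as limit.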
Answer 1 (score: 0)
You can try extracting the title attribute like this:
# find_all is the current spelling; findAll is a legacy alias for it.
links = soup.find_all(attrs={"class": "image-link"})
for link in links:
    print(link["title"])
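Since the question says anything that works is welcome, the same class-based lookup can also be sketched with only the standard library's html.parser. This is an illustration rather than a replacement for BeautifulSoup: the TitleCollector class and the second, title-less link (href "/no-title") are hypothetical and added for the example. It also skips links that have no title, whereas link["title"] raises KeyError in that case.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Hypothetical helper: collects title attributes of <a class="image-link">."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # class may hold several space-separated names, so split before testing.
        classes = (attributes.get("class") or "").split()
        title = attributes.get("title")
        # Skip links without a title instead of failing on them.
        if "image-link" in classes and title is not None:
            self.titles.append(title)


# Shortened markup from the question plus a made-up link that lacks a title.
html = (
    '<a class="image-link" '
    'href="/new-jersey/communities/holiday-city-at-berkeley" '
    'title="Holiday City at Berkeley"></a>'
    '<a class="image-link" href="/no-title"></a>'
)
collector = TitleCollector()
collector.feed(html)
print(collector.titles)
```

For anything beyond simple attribute collection, BeautifulSoup remains the more convenient choice.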