Question

I'm working with Beautiful Soup and am trying to grab the first tag on a page that has an attribute equal to a certain string.

For example:

<a href="url" title="export"></a>

What I've been trying to do is grab the href of the first <a> tag found whose title is "export".

If I use soup.select("a[title='export']"), I end up finding all tags that satisfy this requirement, not just the first.
If I use find("a", {"title": "export"}), so that the title must equal "export", it grabs the actual items inside the tag, not the href.
If I write .get("href") after calling find(), I get None back.

I've been searching the documentation and Stack Overflow for an answer but have yet to find one. Does anyone know a solution to this? Thank you!

Answer 1

What I've been trying to do is grab the href of the first that is found whose title is "export".

You're almost there. Once you have the tag, all you need to do is index it to get the href. Here's a more bulletproof version:

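try:
    # find() returns the first matching tag, or None if nothing matches
    url = soup.find('a', {'title': 'export'})['href']
    print(url)
except TypeError:
    # No matching tag: subscripting None raises TypeError
    pass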
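
As a self-contained illustration of the same idea, here is a minimal sketch that avoids the try/except by checking for None explicitly. The sample markup and the variable names are made up for the example; select_one() is shown as the CSS-selector counterpart of find(), for anyone who prefers the selector from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, made up for illustration
html = """
<a href="/skip" title="other">skip</a>
<a href="/first.csv" title="export">first</a>
<a href="/second.csv" title="export">second</a>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None if there is no match
tag = soup.find("a", attrs={"title": "export"})
first_href = tag.get("href") if tag is not None else None

# select_one() is the CSS-selector counterpart: first match or None
css_tag = soup.select_one('a[title="export"]')
css_href = css_tag["href"] if css_tag is not None else None

# A title that does not occur on the page yields None instead of an exception
missing = soup.find("a", attrs={"title": "nope"})

print(first_href, css_href, missing)
```

If .get("href") still comes back as None on a real page, it is worth printing the tag that find() returned to confirm it is the <a> element you expect and that it actually carries an href attribute.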

Answer 2

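On the same topic of HTML files: I only want to extract the patent number and citation title from the HTML tags. I tried the code below, but it prints every title in the HTML file, whereas I specifically want only the ones under the citations.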

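from bs4 import BeautifulSoup

url = 'https://patents.google.com/patent/EP1208209A1/en?oq=medicinal+chemistry'
# html_file is assumed to be an already-open file holding the saved page
patent = html_file.read()
#print(patent)
soup = BeautifulSoup(patent, 'html.parser')
x = soup.select('tr[itemprop="backwardReferences"]')
# This searches the whole document, so it prints every title, not only the citations
y = soup.select('td[itemprop="title"]')
print(y)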
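
One possible way to restrict the titles to the citation rows is to search inside each matched row rather than the whole soup. The sketch below assumes the page is structured the way the selectors above imply; the HTML fragment, the publicationNumber itemprop, and the placeholder values are hypothetical and were not checked against the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating the structure implied by the selectors above
html = """
<table>
  <tr itemprop="backwardReferences">
    <td itemprop="publicationNumber">XX0000000A</td>
    <td itemprop="title">Cited patent title</td>
  </tr>
  <tr itemprop="otherRow">
    <td itemprop="title">Unrelated title</td>
  </tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

citations = []
# Only look at rows marked as backward references (citations)
for row in soup.select('tr[itemprop="backwardReferences"]'):
    # Search within the row, not the whole document
    number = row.select_one('[itemprop="publicationNumber"]')  # assumed itemprop
    title = row.select_one('[itemprop="title"]')
    citations.append((
        number.get_text(strip=True) if number else None,
        title.get_text(strip=True) if title else None,
    ))

print(citations)
```

The same scoping can be written as a single descendant selector, soup.select('tr[itemprop="backwardReferences"] td[itemprop="title"]'), when only the titles are needed.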

Beautiful Soup: find the first <a> whose title attribute equals a certain string
