Question

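I'm trying to scrape the text of an href that sits inside a span on a web page. I want to loop over the table and pick up the text in each box, so I can't reach it with a single fixed XPath.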

Here is a screenshot of what I want to find

Answer 1

I think you can do this:

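import urllib.request

page_url = "https://example.com"  # placeholder: insert your URL here

# Download the page and decode the raw bytes into a string
with urllib.request.urlopen(page_url) as f:
    html = f.read().decode('utf-8')

# str.find returns the index of the first match, or -1 if there is none
index = html.find("whatever")  # placeholder: the text you are looking for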

I think this should return the entire HTML of the page, and then you can scrape whatever you need from it.
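
To get from the raw HTML string to the link text, one option is the standard library's html.parser. This is only a minimal sketch and not necessarily what the answer had in mind: the class name LinkTextParser, the sample markup, and the "stock-info.php" filter (borrowed from the link quoted in the other answer) are assumptions for illustration.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect the text of every <a> whose href contains a given fragment."""

    def __init__(self, href_fragment):
        super().__init__()
        self.href_fragment = href_fragment
        self.texts = []        # finished link texts, in document order
        self._current = None   # text pieces of the link being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.href_fragment in href:
                self._current = []

    def handle_data(self, data):
        # Only keep text while we are inside a matching link
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.texts.append("".join(self._current).strip())
            self._current = None


# Hypothetical sample; in practice, feed the html string downloaded above
sample_html = """
<table>
  <tr><td><span><a href="stock-info.php?ticker=III:LN">3i Group PLC</a></span></td></tr>
  <tr><td><span><a href="about.php">About</a></span></td></tr>
</table>
"""

parser = LinkTextParser("stock-info.php")
parser.feed(sample_html)
print(parser.texts)
```

Filtering on the href keeps unrelated links in the table out of the result. If the page builds its table with JavaScript, the downloaded HTML may not contain the rows at all, in which case a browser-driven approach such as Selenium is needed.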

Answer 2

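From the information you provided, you want to print the link text, which in this example is "3i Group PLC". I'll refer to the XPath of <a href="stock-info.php?ticker=III:LN">3i Group PLC</a> as link_abc, because you didn't provide the actual XPath.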

Here is the code using the Selenium library (driver is an already-created WebDriver):

from selenium.webdriver.common.by import By

name = driver.find_element(By.XPATH, "link_abc")  # find_element_by_xpath was removed in Selenium 4
print(name.text)

The output should be 3i Group PLC.
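
Since the question is about looping over the table rather than using one fixed XPath, find_elements with a relative XPath may be a better fit. The following is a sketch under assumptions: the helper name link_texts and the XPath pattern are hypothetical, and the real page's markup may call for a different expression. By.XPATH is the plain string "xpath", which is passed directly here so the helper needs no import of its own.

```python
def link_texts(driver, href_fragment="stock-info.php"):
    """Return the visible text of every table link whose href contains href_fragment."""
    # Hypothetical XPath: any <a> inside a <table> whose href contains the fragment
    xpath = f"//table//a[contains(@href, '{href_fragment}')]"
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH
    links = driver.find_elements("xpath", xpath)
    return [link.text for link in links]
```

With a live session, this could be used as `for name in link_texts(driver): print(name)`. Note that `.text` only returns text that is visible on the page, so hidden rows come back as empty strings.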

How to access text from a span inside an href (Selenium, Python)
