Question

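First of all, I'm relatively new to Python. I need to extract a link based on its text from a web page. I'm using lxml with Python 3.5, but I can't figure it out. This is what I have so far: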

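import requests
from lxml import html

url = someUrl  # placeholder for the page address
page = requests.get(url)
webpage = html.fromstring(page.content)
fulllinks = webpage.xpath('//a/@href')   # every href on the page
fulltext = webpage.xpath('//a/text()')   # every link text on the page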


for line in fulltext:
    if line.startswith("SomethingHere"):
        # get the link from SomethingHere and do other stuff
        pass

Here "SomethingHere" is the link text, and I want the link that belongs to that text (for example www.someweb.com.br/trends).

I'm a bit lost here. Thanks in advance.

Answer 1

I got what I wanted. The answer is:

webpage.xpath("//a[starts-with(text(),'SomethingHere')]/@href")
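
For anyone who wants to try this without hitting a real site, here is a minimal sketch of the same XPath run against an inline HTML string. The sample markup, the `find_links` helper, and the `prefix` variable are assumptions made up for illustration; they are not part of the original code.

```python
from lxml import html

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<html><body>
  <a href="http://www.someweb.com.br/trends">SomethingHere: trends</a>
  <a href="http://www.someweb.com.br/other">Another link</a>
</body></html>
"""


def find_links(document, prefix):
    """Return the href of every <a> whose text starts with `prefix`."""
    tree = html.fromstring(document)
    # lxml accepts XPath variables as keyword arguments, which avoids
    # quoting problems when the prefix itself contains quote characters.
    return tree.xpath("//a[starts-with(text(), $prefix)]/@href", prefix=prefix)


links = find_links(SAMPLE_HTML, "SomethingHere")
print(links)
```

This should print a list holding only the first link's href. Note that `starts-with(text(), ...)` looks at the first text node exactly as written, so leading whitespace or text wrapped in a nested tag would not match; `starts-with(normalize-space(.), $prefix)` is one way to loosen that.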

Thanks a lot.
