Question

I'm using Selenium. I want to collect every link on an HTML page that starts with "https://instagram.com/p/" and save those links in an array.

My code looks like this:

src = browser.page_source
#here I get the html page
tag = src.findall("https://instagram.com/p/")  
tag = []
print(tag)

I'd like to do something like this, but I don't know how.

Answer 1

Try this:

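from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://instagram.com/p/")

# Collect every <a> element that has an href attribute.
# find_elements_by_xpath was removed in recent Selenium 4 releases,
# so use find_elements(By.XPATH, ...) instead.
a_tags = driver.find_elements(By.XPATH, "//a[@href]")
links = [tag.get_attribute("href") for tag in a_tags]

print(links)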
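
The snippet above collects every link on the page, while the question asks only for links that begin with "https://instagram.com/p/". Below is a minimal sketch of one way to do that filtering directly on the page source, closer to what the question attempted. The helper name `extract_post_links` and the `sample_html` string are made up for illustration; in practice the input would be `browser.page_source`. It also assumes the links appear in the source as absolute URLs with exactly this prefix, so a page that uses relative hrefs or a `www.` host would need a different prefix.

```python
import re

# Prefix taken from the question; adjust it if the page uses another host.
PREFIX = "https://instagram.com/p/"


def extract_post_links(html, prefix=PREFIX):
    """Return the unique URLs in html that start with prefix, in order of appearance."""
    # Match the prefix, then everything up to a quote, whitespace, or angle bracket.
    pattern = re.compile(re.escape(prefix) + r"[^\s\"'<>]*")
    unique_links = []
    for url in pattern.findall(html):
        if url not in unique_links:
            unique_links.append(url)
    return unique_links


# Hypothetical stand-in for browser.page_source.
sample_html = (
    '<a href="https://instagram.com/p/abc123/">first</a> '
    '<a href="https://example.com/">other</a> '
    '<a href="https://instagram.com/p/abc123/">duplicate</a> '
    "<a href='https://instagram.com/p/xyz/'>second</a>"
)

print(extract_post_links(sample_html))
```

If you already have the `links` list from the Selenium code, a list comprehension such as `[l for l in links if l and l.startswith(PREFIX)]` should give a similar result without a regular expression.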
