BeautifulSoup broken link checker / web crawler

Time: 2021-04-12 22:31:48

Tags: python beautifulsoup link-checking


However, I'm having trouble with one line of the code. When I run the program, I get this error message:

File "/Users/Documents/", line 26
print(f"Url: {link.get('href')} " + f"| Status Code: {response_code}")
SyntaxError: invalid syntax




# Import libraries
from bs4 import BeautifulSoup, SoupStrainer
import requests

# Prompt user to enter the URL
url = input("Enter your url: ")

# Make a request to get the URL
page = requests.get(url)

# Get the response code of given URL
response_code = str(page.status_code)

# Display the text of the URL in str
data = page.text

# Use BeautifulSoup to use the built-in methods
soup = BeautifulSoup(data)

# Iterate over all links on the given URL with the response code next to it
for link in soup.find_all('a'):
    print(f"Url: {link.get('href')} " + f"| Status Code: {response_code}")

1 answer:

Answer 0: (score: 0)

You have to pass an additional argument, features="lxml" or features="html.parser", to the BeautifulSoup constructor.

soup = BeautifulSoup(data, features="html.parser")
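
Separately, the SyntaxError quoted in the question points at the f-string line itself. f-strings were added in Python 3.6, so one possible cause, assuming the script is being run with an older interpreter, is that the `f` prefix is rejected before any BeautifulSoup code runs. Below is a minimal sketch of the same output line built with str.format instead. The helper name format_link_line and the sample values are hypothetical and only for illustration.

```python
import sys


def format_link_line(href, status_code):
    """Build the same output line as the question's print call, without f-strings."""
    # str.format with auto-numbered {} fields works on Python 2.7 and 3.x,
    # whereas f-strings need Python 3.6 or newer.
    return "Url: {} | Status Code: {}".format(href, status_code)


if __name__ == "__main__":
    # Show which interpreter is actually running the script.
    print("Python version: {}.{}".format(sys.version_info[0], sys.version_info[1]))
    # Hypothetical sample values, not taken from a real request.
    print(format_link_line("https://example.com/", "200"))
```

If the printed version is below 3.6, running the original script with a newer Python 3 interpreter should also make the f-string line valid.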