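In Firefox, you can right-click an image and choose "Copy Image Location". This gives you the absolute image path even when the image's src attribute contains only a relative path. Is it possible to get this absolute path programmatically? Where is it stored?
I am using Python 3, with requests to fetch the site and Beautiful Soup to parse the HTML.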
Answer 0 (score: 0)
from bs4 import BeautifulSoup
from requests import get

url = 'https://example.com/'
response = get(url)
soup = BeautifulSoup(response.content, 'html.parser')
# src=True keeps only <img> tags that actually carry a src attribute
# (hasattr(img, 'src') is always true for a bs4 Tag, so it filters nothing);
# building a set prevents duplicates
images = {img['src'] for img in soup.find_all('img', src=True)}
for img in images:
    print(img)
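If the images use relative paths (or an external host, a CDN, etc.), the code below can clean up most of them.
Note: using a local URI (file:///temp/web/img1.png)
This code uses the validators package, so install it first with pip install validators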
from urllib.parse import urljoin

from bs4 import BeautifulSoup
from requests import get
import validators

url = 'https://example.com/'
response = get(url)
soup = BeautifulSoup(response.content, 'html.parser')
images = {img['src'] for img in soup.find_all('img', src=True)}
list_of_img_paths = []
for img in images:
    if not validators.url(img):  # the src value is NOT a valid URL on its own
        # Here we can assume we are dealing with a relative path.
        # urljoin resolves it against the page URL; os.path.join/normpath
        # are meant for file paths and would mangle the "https://" prefix.
        formatted_url = urljoin(response.url, img)  # format a valid url
        list_of_img_paths.append(formatted_url)  # add to list
    else:
        list_of_img_paths.append(img)
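
As for where the absolute path is "stored": as far as I know it is not stored in the HTML at all. The browser computes it by resolving the src value against the URL of the page. A minimal sketch of that resolution step using only the standard library's urllib.parse.urljoin is shown below. The function name resolve_image_urls and the sample URLs are made up for illustration, and the sketch ignores a <base> element, which a browser would take into account if the page has one.

```python
from urllib.parse import urljoin


def resolve_image_urls(page_url, srcs):
    """Resolve each src value against the URL of the page it came from.

    resolve_image_urls is a hypothetical helper, not part of any library.
    """
    # urljoin returns absolute URLs unchanged and resolves everything else
    # (relative paths, root-relative paths, protocol-relative URLs)
    # against page_url.
    return [urljoin(page_url, src) for src in srcs]


if __name__ == "__main__":
    page_url = "https://example.com/blog/post.html"  # hypothetical page URL
    srcs = [
        "img/a.png",                        # relative to the page's directory
        "../b.png",                         # one directory up
        "/static/c.png",                    # relative to the site root
        "//cdn.example.net/d.png",          # protocol-relative (CDN style)
        "https://other.example.org/e.png",  # already absolute
    ]
    for absolute in resolve_image_urls(page_url, srcs):
        print(absolute)
```

With requests, response.url is a reasonable choice for page_url, because it reflects any redirects that happened while fetching the page.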