Question

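In Firefox, you can right-click an image and choose "Copy Image Location". This gives you the absolute image path even when the image's src attribute only provides a relative path. Is it possible to obtain this absolute path programmatically? Where is it stored?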

I'm using Python 3, with requests to access the website and Beautiful Soup to parse the HTML.

Answer 1

Simple solution

from bs4 import BeautifulSoup
from requests import get

url = 'https://example.com/'
response = get(url)
soup = BeautifulSoup(response.content, 'html.parser')

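# src=True skips <img> tags that have no src attribute
# (hasattr() is always true on a bs4 Tag, so it cannot be used as the filter);
# building a set removes duplicates
images = {img['src'] for img in soup.find_all('img', src=True)}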

for img in images:
    print(img)

Extended solution

If the images use relative paths (or an external host, a CDN, etc.), the code below can clean up most of them.

Note: this does not work with local URIs (file:///temp/web/img1.png).

This code uses the validators package, so install it first: pip install validators

from bs4 import BeautifulSoup
from requests import get
from urllib.parse import urljoin
import validators

url = 'https://example.com/'
response = get(url)
soup = BeautifulSoup(response.content, 'html.parser')

# src=True skips <img> tags that have no src attribute
images = {img['src'] for img in soup.find_all('img', src=True)}

list_of_img_paths = []

for img in images:
    if not validators.url(img):  # src is NOT a full URL on its own
        # Assume a relative path and resolve it against the page URL.
        # urljoin is used instead of os.path.join/normpath, which would
        # collapse the '//' after the scheme and break the URL.
        list_of_img_paths.append(urljoin(url, img))
    else:
        list_of_img_paths.append(img)
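
As for where the absolute path "is stored": as far as I know it is not stored anywhere in the HTML. The browser works it out by resolving the src value against the address of the page. The standard library's urllib.parse.urljoin applies the same kind of resolution. Below is a minimal sketch of how it treats the common forms a src can take. The page URL, the src values, and the helper name resolve_all are made up for illustration, and a page that sets a different base with a <base href> tag is not handled here.

```python
from urllib.parse import urljoin

# Hypothetical page URL and src values, chosen only for illustration.
page_url = 'https://example.com/blog/post.html'
src_values = [
    'img/a.png',                      # relative to the page's directory
    '/static/b.png',                  # relative to the site root
    '../c.png',                       # one directory up from the page
    '//cdn.example.net/d.png',        # protocol-relative, keeps the page's scheme
    'https://cdn.example.net/e.png',  # already absolute, returned unchanged
]


def resolve_all(base, srcs):
    """Resolve every src value against the base (page) URL."""
    return [urljoin(base, src) for src in srcs]


resolved = resolve_all(page_url, src_values)

for src, absolute in zip(src_values, resolved):
    print(f'{src!r} -> {absolute}')
```

Because urljoin returns an already absolute URL unchanged, it can be applied to every src value without checking first whether the value is relative.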

How to get the absolute path of an image on a website
