Printing an image from a website using BeautifulSoup

Time: 2020-06-05 00:53:32

Tags: python web-scraping beautifulsoup

I'm trying out BeautifulSoup in Python to parse information from this website:

https://www.vogue.com/fashion/street-style

How would you print an image from the website? For example, I tried parsing the logo that appears in the HTML:

from bs4 import BeautifulSoup
import requests

source = requests.get('https://www.vogue.com/fashion/street-style').text

soup = BeautifulSoup(source, 'lxml')

logo = soup.find('img', class_='logo--image')
print(logo)

This prints:

<img alt="Vogue Logo" class="logo--image" src="/img/vogue-logo-pride.svg"/>

How can I print the actual image in my editor?

1 answer:

Answer 0 (score: 1)

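Quick summary

Unfortunately, most editors and terminals display text only, not images. There are a couple of alternatives:

  • Option 1: Download the image and open it the usual way.

  • Option 2: Open a new browser tab showing the image.

Solution

Option 1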

from urllib.parse import urljoin

from bs4 import BeautifulSoup
import requests

WEBSITE = 'https://www.vogue.com/fashion/street-style'
SAVE_LOCATION = 'downloaded.svg'  # name of file to save image to

source = requests.get(WEBSITE).text

soup = BeautifulSoup(source, 'lxml')

# src is a root-relative path such as /img/vogue-logo-pride.svg
relative_link_to_logo = soup.find('img', class_='logo--image')['src']

# Resolve the path against the page URL instead of appending it
img = requests.get(urljoin(WEBSITE, relative_link_to_logo), stream=True)
with open(SAVE_LOCATION, 'wb') as f:
    # Write the response body to disk in chunks
    for chunk in img.iter_content(chunk_size=8192):
        f.write(chunk)
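
Once the file is on disk, the "open it" step can also be done from the script. The following is a minimal sketch using only the standard library; `local_file_uri` and `open_downloaded_image` are hypothetical helper names, and `downloaded.svg` is assumed to be the file saved by the code above. Whether a `file://` URI opens in a browser or in another associated program depends on the operating system.

```python
from pathlib import Path
import webbrowser


def local_file_uri(path):
    """Return a file:// URI for a path on disk (hypothetical helper)."""
    return Path(path).resolve().as_uri()


def open_downloaded_image(path='downloaded.svg'):
    """Ask the default browser to open the saved image.

    Returns whatever webbrowser.open returns (True if a browser was launched).
    """
    return webbrowser.open(local_file_uri(path))
```

Calling `open_downloaded_image()` after the download finishes should bring up the logo in whatever application the system associates with the file.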

Option 2

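from urllib.parse import urljoin
import webbrowser

from bs4 import BeautifulSoup
import requests

WEBSITE = 'https://www.vogue.com/fashion/street-style'

source = requests.get(WEBSITE).text

soup = BeautifulSoup(source, 'lxml')

# src is a root-relative path such as /img/vogue-logo-pride.svg
relative_link_to_logo = soup.find('img', class_='logo--image')['src']

# Resolve the path against the page URL, then open it in a new browser tab
webbrowser.open(urljoin(WEBSITE, relative_link_to_logo))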
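
One detail worth checking in both options is how the image URL is built. The `src` value in the question starts with `/`, which makes it relative to the site root rather than to the page, so it needs to be resolved against the page URL instead of appended to it. A small sketch of the difference, using `urllib.parse.urljoin` from the standard library (the variable names are illustrative; the values are the ones from the question):

```python
from urllib.parse import urljoin

page_url = 'https://www.vogue.com/fashion/street-style'
src = '/img/vogue-logo-pride.svg'  # value of the logo's src attribute

# Plain concatenation keeps the page path and doubles the slash
concatenated = f'{page_url}/{src}'

# urljoin resolves the root-relative path against the site root
resolved = urljoin(page_url, src)

print(concatenated)  # https://www.vogue.com/fashion/street-style//img/vogue-logo-pride.svg
print(resolved)      # https://www.vogue.com/img/vogue-logo-pride.svg
```

`urljoin` also handles page-relative paths and absolute URLs in `src`, so it is a safer default than string formatting.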

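Final words

Some editors (such as Visual Studio Code) have extensions that let you preview image files already downloaded to your machine without leaving the editor. This still relies on Option 1, since the file has to be downloaded to disk first.

Hope this helps!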