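I am trying to get image URLs from the Flipkart website using Beautiful Soup, and I am getting a KeyError. I am trying to read the image URL from the img tags with a given class, which have both alt and src attributes.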
import requests
from bs4 import BeautifulSoup
r = requests.get("https://www.flipkart.com/men/shirts/casual-party-wear-shirts/prsid=2oq,s9b,mg4,vg6&p[]=facets.price_range.from%3DMin&p[]=facets.price_range.to%3D799&otracker=sp_browse_announcement_search.flipkart.com")
html = BeautifulSoup(r.text, 'lxml')
for img in html('img','_3togXc'):
    print(img['alt src'])  # this line raises the KeyError
The expected result is the image URL:
src="https://rukminim1.flixcart.com/image/309/371/jtsz3bk0/shirt/p/n/r/3xl-twtblshirtful-sh4-tripr-original-imaffycxgppmkknv.jpeg?q=50"
...but I get a KeyError instead.
Answer 0 (score: 0)
The code below should help you get started:
import requests
from bs4 import BeautifulSoup
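# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(requests.get('https://matplotlib.org/tutorials/introductory/sample_plots.html').content, 'html.parser')
# find() returns the first match; select() returns a list of all matches
image_div = soup.find('div', {'class': 'figure align-center'})  # the complete div element
image_tag = image_div.select('img')  # list of img elements inside that div
# Attributes are read one at a time: 'src' and 'alt' are separate keys
imageLink = image_tag[0]['src']
imageAlt = image_tag[0]['alt']
# Some manipulation, if required: turn the relative path into an absolute URL
imageLink = imageLink.replace("../../", 'https://matplotlib.org/')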
print(imageLink)
print(imageAlt)
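The replace("../../", ...) step in the code above only works when the relative prefix is exactly "../../". A more general sketch, assuming the page URL is known, resolves the src against it with urljoin from the standard library. The helper name absolute_image_url and the image file name below are hypothetical and used only for illustration.

```python
from urllib.parse import urljoin

# Page the image tag was scraped from (same URL as in the answer above).
PAGE_URL = "https://matplotlib.org/tutorials/introductory/sample_plots.html"


def absolute_image_url(page_url, src):
    """Resolve a possibly relative src attribute against the page URL."""
    return urljoin(page_url, src)


# Hypothetical relative src value, made up for this example.
relative_src = "../../_images/example_plot.png"
print(absolute_image_url(PAGE_URL, relative_src))
```

A src that is already absolute is returned unchanged by urljoin, so the same helper can be applied to every image on a page.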
Reference for some useful selectors
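Applied to the question, the KeyError most likely comes from img['alt src']: alt and src are two separate attributes, so there is no key named 'alt src'. A minimal sketch follows, assuming the img tags are actually present in the HTML that requests receives (if the page fills them in with JavaScript, they may not be). The markup and example.com URLs are made up; only the class name _3togXc comes from the question, and image_urls is a hypothetical helper.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the Flipkart page. The class name
# "_3togXc" is taken from the question; the URLs are invented.
SAMPLE_HTML = """
<div>
  <img class="_3togXc" alt="Blue shirt" src="https://example.com/a.jpeg?q=50">
  <img class="_3togXc" alt="No source yet">
  <img class="other" alt="Logo" src="https://example.com/logo.png">
</div>
"""


def image_urls(html, css_class):
    """Collect the src of every img tag that carries the given class."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img", class_=css_class):
        # .get() returns None instead of raising KeyError when src is missing
        src = img.get("src")
        if src:
            urls.append(src)
    return urls


print(image_urls(SAMPLE_HTML, "_3togXc"))
```

Using img.get("src") rather than img["src"] keeps the loop from failing on tags that lack the attribute, for example lazily loaded images.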