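How do I get the image URL from an img tag inside a div tag on the Flipkart website?

Asked: 2019-05-02 06:39:13

Tags: python web-scraping

I am trying to get image URLs from the Flipkart website using Beautiful Soup, and I am getting a KeyError. I tried to read the image URL from the img tags with the given class by indexing with 'alt src'.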

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.flipkart.com/men/shirts/casual-party-wear-shirts/prsid=2oq,s9b,mg4,vg6&p[]=facets.price_range.from%3DMin&p[]=facets.price_range.to%3D799&otracker=sp_browse_announcement_search.flipkart.com")
html = BeautifulSoup(r.text, 'lxml')
for img in html('img','_3togXc'):
    print(img['alt src'])

The expected result is the image URL:

src="https://rukminim1.flixcart.com/image/309/371/jtsz3bk0/shirt/p/n/r/3xl-twtblshirtful-sh4-tripr-original-imaffycxgppmkknv.jpeg?q=50"

...but I get a KeyError instead.

1 Answer:

Answer 0 (score: 0)

The code below should help you get started:

import requests
from bs4 import BeautifulSoup

response = requests.get('https://matplotlib.org/tutorials/introductory/sample_plots.html')
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(response.content, 'html.parser')
# find() returns the first match; use select() to get all matches
image_div = soup.find('div', {'class': 'figure align-center'})  # the complete div element
image_tag = image_div.select('img')  # list of img elements inside the div
imageLink = image_tag[0]['src']
imageAlt = image_tag[0]['alt']
# Some manipulation if required: turn the relative path into an absolute URL
imageLink = imageLink.replace("../../", 'https://matplotlib.org/')
print(imageLink)
print(imageAlt)
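
Applied to the snippet in the question, the KeyError most likely comes from `img['alt src']`: Beautiful Soup looks that string up as a single attribute name, whereas `alt` and `src` are two separate attributes. Here is a minimal sketch, run against a made-up HTML fragment so it works offline. The markup, the `example.com` URLs and the helper name `extract_image_urls` are assumptions for illustration; only the class name `_3togXc` comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the product grid; not real Flipkart HTML.
SAMPLE_HTML = """
<div class="card"><img class="_3togXc" alt="Blue shirt" src="https://example.com/a.jpeg?q=50"></div>
<div class="card"><img class="_3togXc" alt="No source yet"></div>
<div class="card"><img class="other" alt="Logo" src="https://example.com/logo.png"></div>
"""


def extract_image_urls(html_text, css_class):
    """Return the src of every <img> with the given class, skipping tags without one."""
    soup = BeautifulSoup(html_text, "html.parser")
    urls = []
    for img in soup.find_all("img", class_=css_class):
        src = img.get("src")  # .get() returns None instead of raising KeyError
        if src:
            urls.append(src)
    return urls


image_urls = extract_image_urls(SAMPLE_HTML, "_3togXc")
print(image_urls)
```

If the same loop finds no tags on the live page, the images may be added by JavaScript after the page loads, in which case `requests` alone would not see them. That is a guess and not something the question confirms.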

Also see https://sites.google.com/view/way2learnings/programming-languages/python/python-libraries/beautifulsoup for some useful selectors.
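
As a rough illustration of the kind of selectors meant here, the sketch below runs a few common CSS selectors through `select()` and `select_one()`. The HTML fragment, the `id` value and the file names are invented for the example and are not taken from the linked page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<div class="figure align-center" id="first">
  <img alt="Plot A" src="../../_images/a.png">
</div>
<div class="figure">
  <img alt="Plot B" src="../../_images/b.png">
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Descendant selector: every <img> inside a <div> that has the class "figure".
all_srcs = [img["src"] for img in soup.select("div.figure img")]

# Chained classes: only divs carrying both "figure" and "align-center".
centered_alts = [img["alt"] for img in soup.select("div.figure.align-center img")]

# Attribute selector: only <img> tags that actually have a src attribute.
with_src = soup.select("img[src]")

# ID selector; select_one() returns a single element, or None if nothing matches.
first_div = soup.select_one("#first")

print(all_srcs)
print(centered_alts)
```

Unlike `find('div', {'class': 'figure align-center'})`, which compares against the whole class string, `div.figure.align-center` matches the two classes in any order.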