Question

I need to find an image in HTML code similar to this:

...
<a href="/example/11/1"> 
    <img src="http://example.net/example.jpg" alt="Example"/>
</a>
...

and then download the image that its src attribute points to.

Answer 1

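This should be a good starting point:

# Python 2 (urllib2); find_all() requires BeautifulSoup 4 (the bs4 package)
import urllib2
from urlparse import urljoin
from bs4 import BeautifulSoup

base_url = 'http://yahoo.com'
page = urllib2.urlopen(base_url).read()
soup = BeautifulSoup(page, 'html.parser')
counter = 0
for img in soup.find_all('img'):
    src = img.get('src')
    if not src:
        continue  # skip <img> tags without a src attribute
    # Resolve relative src values against the page URL before downloading
    with open("image" + str(counter), 'wb') as f:
        f.write(urllib2.urlopen(urljoin(base_url, src)).read())
    counter += 1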
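
For the exact markup in the question (an img nested inside an a element), the lookup can be narrowed to linked images only. The following is a minimal Python 3 sketch that uses only the standard library instead of BeautifulSoup; the names LinkedImageParser, find_linked_images, download, and SAMPLE_HTML are hypothetical and not part of the original answer.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# The snippet from the question, used here as sample input
SAMPLE_HTML = '''
<a href="/example/11/1">
    <img src="http://example.net/example.jpg" alt="Example"/>
</a>
'''


class LinkedImageParser(HTMLParser):
    """Collect the src of every <img> that is nested inside an <a>."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0  # greater than 0 while inside an <a> element
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.anchor_depth += 1
        elif tag == 'img' and self.anchor_depth > 0:
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)

    def handle_endtag(self, tag):
        if tag == 'a' and self.anchor_depth > 0:
            self.anchor_depth -= 1


def find_linked_images(html, base_url=''):
    """Return absolute URLs of images wrapped in links."""
    parser = LinkedImageParser()
    parser.feed(html)
    parser.close()
    # urljoin leaves absolute URLs untouched and resolves relative ones
    return [urljoin(base_url, src) for src in parser.sources]


def download(url, filename):
    """Fetch url and write the response body to filename."""
    with urlopen(url) as response, open(filename, 'wb') as f:
        f.write(response.read())


image_urls = find_linked_images(SAMPLE_HTML)
```

Each entry of image_urls could then be passed to download(url, "image0.jpg") and so on; the download step is left uncalled here because it needs network access.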

Answer 2

This locates the source URLs of the image files on a Bing image search results page and downloads each one.

from urllib.request import urlopen, Request
from bs4 import BeautifulSoup

# Send a browser-like User-Agent; some sites reject the default Python one
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0'

url = "https://www.bing.com/images/search?q=ciling+fan&qs=n&form=QBILPG&sp=-1&ghc=1&pq=ciling+fan&sc=8-4&sk=&cvid=73D78D239D574921A293EF9725CD2F65"
headers = {'User-Agent': user_agent}

request = Request(url, None, headers)  # The assembled request
response = urlopen(request)
soup = BeautifulSoup(response, 'html.parser')
counter = 0

# Search each level from its parent element rather than from soup,
# so that every image is visited (and downloaded) only once
for ul in soup.find_all('ul', {'class': 'dgControl_list'}):
    for li in ul.find_all('li'):
        for images in li.find_all('div', {'class': 'img_cont hoff'}):
            s = images.find('img')
            img = s.get('data-src') if s is not None else None
            if img is not None:
                # Raw string keeps the backslashes of the Windows path intact
                with open(r"Z:\pyimages\image" + str(counter) + ".jpeg", 'wb') as f:
                    f.write(urlopen(img).read())
                counter += 1
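
Both answers choose the output file name by hand: the first writes no extension at all and the second always appends ".jpeg". One possible refinement, sketched below under the assumption that the image URL path usually ends in the real extension, is to derive the extension from the URL. The helper filename_from_url and its default_ext parameter are hypothetical names, not part of either answer.

```python
import posixpath
from urllib.parse import urlparse, unquote


def filename_from_url(url, index, default_ext='.jpg'):
    """Build a local file name such as 'image3.png' from an image URL."""
    # Only the path component matters; the query string is ignored
    path = unquote(urlparse(url).path)
    ext = posixpath.splitext(path)[1].lower()
    # Fall back when the URL has no extension or an implausibly long one
    if not ext or len(ext) > 5:
        ext = default_ext
    return 'image{}{}'.format(index, ext)
```

Inside either download loop, the open() call could then use filename_from_url(img_url, counter) in place of the hard-coded name.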
