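I am building a dataset, so I am trying to download multiple images from a website with the help of BeautifulSoup.
But when I open the corresponding URL, it raises HTTP Error 400: Bad Request.
I have read the urllib.request documentation, but I could not solve the problem.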
from urllib.request import urlopen
#url = "http://accionlabs.com/company/leadership-team/executive-leadership-team/index.html"
url = "https://www.kaufland.de/sortiment/unsere-marken/k-classic.html"
html = urlopen(url)
from bs4 import BeautifulSoup
soup = BeautifulSoup(html)
for res in soup.findAll('img'):
    print(res.get('src'))
    list_var = url.split('/')
    resource = urlopen(list_var[0]+"//"+list_var[2]+res.get('src'))
    output = open(res.get('src').split('/')[-1],"wb")
    output.write(resource.read())
    output.close()
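
One possible cause, and this is only a guess since the printed src values are not shown here, is that some src attributes are already absolute URLs, are protocol-relative, or contain spaces, so gluing them onto the scheme and host produces a malformed URL that the server rejects with 400. A minimal sketch of building the image URL with urllib.parse instead of string concatenation follows. The helper names resolve_image_url and fetch_bytes are hypothetical, and the User-Agent header is an assumption about what the server may expect, not something confirmed for this site.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit
from urllib.request import Request, urlopen


def resolve_image_url(page_url, src):
    """Turn an <img> src into an absolute, percent-encoded http(s) URL.

    Returns None when src is missing or is not an http(s) resource
    (for example an inline data: URI).
    """
    if not src:
        return None
    # urljoin handles absolute, protocol-relative and relative src values.
    absolute = urljoin(page_url, src.strip())
    parts = urlsplit(absolute)
    if parts.scheme not in ("http", "https"):
        return None
    # Encode spaces and other unsafe characters in the path,
    # leaving existing %-escapes and slashes untouched.
    safe_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, safe_path, parts.query, parts.fragment)
    )


def fetch_bytes(url, timeout=10):
    """Download a URL, sending a browser-like User-Agent (an assumption)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read()
```

Inside the loop, the concatenation could then be replaced by something like image_url = resolve_image_url(url, res.get('src')), skipping the tag when the result is None and writing fetch_bytes(image_url) to the output file. Printing image_url before each request should also show which address actually triggers the 400.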