Question

The original code is here: https://github.com/amitabhadey/Web-Scraping-Images-using-Python-via-BeautifulSoup-/blob/master/code.py

I'm trying to adapt a Python script that collects images from a website, as a way to get better at web scraping.

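The first error was:

The code that caused this warning is on line 12 of the file /Bureau/scrapper.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

So I did: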

soup = BeautifulSoup(plain_text, features="lxml")
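
For reference, here is a minimal sketch of what naming the parser explicitly looks like on its own. The sample markup and the `make_soup` helper are hypothetical and not part of the original script. The fallback to Python's built-in `html.parser` is an assumption added so the snippet still runs when `lxml` is not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical stand-in for the downloaded page text.
plain_text = '<html><body><a class="photo_link" href="/photo/1">one</a></body></html>'

def make_soup(markup, preferred="lxml"):
    """Parse markup with an explicitly named parser (hypothetical helper)."""
    try:
        return BeautifulSoup(markup, features=preferred)
    except FeatureNotFound:
        # lxml is a separate install; html.parser ships with Python.
        return BeautifulSoup(markup, features="html.parser")

soup = make_soup(plain_text)
print(soup.a.get("href"))
```

Naming any parser in the constructor is what silences the warning; which parser is the better choice depends on what is installed.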

I also changed the class name to match the tag used on 500px.

But now the script just stops and nothing happens.

In the end it looks like this:

import requests 
from bs4 import BeautifulSoup 
import urllib.request
import random 

url = "https://500px.com/editors"

source_code = requests.get(url)

plain_text = source_code.text

soup = BeautifulSoup(plain_text, features="lxml")

for link in soup.find_all("a",{"class":"photo_link "}):
    href = link.get('href')
    print(href)

    img_name = random.randrange(1,500)

    full_name = str(img_name) + ".jpg"

    urllib.request.urlretrieve(href, full_name)

    print("loop break")

What am I doing wrong?
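
One way to narrow this down, as a minimal sketch: if `find_all` returns an empty list, the body of the `for` loop never runs and the script exits without printing anything, which would match the "nothing happens" symptom. Whether that is what happens with the real page is an assumption. The `photo_hrefs` helper and the sample markup below are hypothetical, and `html.parser` is used only so the snippet has no extra dependency.

```python
from bs4 import BeautifulSoup

def photo_hrefs(html, class_name="photo_link"):
    """Return the href of every <a> carrying class_name (hypothetical helper)."""
    soup = BeautifulSoup(html, features="html.parser")
    # Strip stray spaces first: a single CSS class name cannot contain one.
    links = soup.find_all("a", class_=class_name.strip())
    return [link.get("href") for link in links if link.get("href")]

# Hypothetical sample markup; in the script above this would be source_code.text.
sample = '<a class="photo_link" href="https://example.com/a.jpg">a</a><div>no links here</div>'

hrefs = photo_hrefs(sample)
if not hrefs:
    print("no matching <a> tags: the loop body would never run")
else:
    print(len(hrefs), "link(s) found:", hrefs)
```

Running the same check against `source_code.text` would show whether the HTML returned by `requests` contains any matching tags at all. A page that builds its content with JavaScript would not include them in that text.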

Error when scraping images with BeautifulSoup

0 answers: