Download a remote image and save it to a Django model

Asked: 2013-04-23 15:59:04

Tags: python django django-models

I am writing a Django app that fetches all the images from a given URL and saves them in the database.

But I can't figure out how to use ImageField in Django.

Settings.py

MEDIA_ROOT = os.path.join(PWD, "../downloads/")

# URL that handles the media served from MEDIA_ROOT. Make sure to use a
# trailing slash.
# Examples: "http://example.com/media/", "http://media.example.com/"
MEDIA_URL = '/downloads/'

models.py

class images_data(models.Model):
        image_id =models.IntegerField()
        source_id = models.IntegerField()
        image=models.ImageField(upload_to='images',null=True, blank=True)
        text_ind=models.NullBooleanField()
        prob=models.FloatField()

download_img.py

def spider(site):
        PWD = os.path.dirname(os.path.realpath(__file__ ))
        #site="http://en.wikipedia.org/wiki/Pune"
        hdr= {'User-Agent': 'Mozilla/5.0'}
        outfolder=os.path.join(PWD, "../downloads")
        #outfolder="/home/mayank/Desktop/dreamport/downloads"
        print "MAYANK:"+outfolder
        req = urllib2.Request(site,headers=hdr)
        page = urllib2.urlopen(req)
        soup =bs(page)
        tag_image=soup.findAll("img")
        count=1;
        for image in tag_image:
                print "Image: %(src)s" % image
                filename = image["src"].split("/")[-1]
                outpath = os.path.join(outfolder, filename)
                urlretrieve('http:'+image["src"], outpath)
                im = img(image_id=count,source_id=1,image=outpath,text_ind=None,prob=0)
                im.save()
                count=count+1

I am calling download_imgs.py in a view, like this:

        if form.is_valid():
                url = form.cleaned_data['url']
                spider(url)

8 Answers:

Answer 0 (score: 37)

The Django documentation is always a good place to start.

class ModelWithImage(models.Model):
    image = models.ImageField(
        upload_to='images',
    )

UPDATED

So this script works.

  • Loop over the images to download
  • Download the image
  • Save it to a temp file
  • Apply it to the model
  • Save the model

import requests
import tempfile

from django.core import files

# List of images to download
image_urls = [
    'http://i.thegrindstone.com/wp-content/uploads/2013/01/how-to-get-awesome-back.jpg',
]

for image_url in image_urls:
    # Stream the image from the url
    request = requests.get(image_url, stream=True)

    # Was the request OK?
    if request.status_code != requests.codes.ok:
        # Nope, error handling, skip file etc etc etc
        continue

    # Get the filename from the url, used for saving later
    file_name = image_url.split('/')[-1]

    # Create a temporary file
    lf = tempfile.NamedTemporaryFile()

    # Read the streamed image in sections
    for block in request.iter_content(1024 * 8):

        # If no more file then stop
        if not block:
            break

        # Write image block to temporary file
        lf.write(block)

    # Create the model you want to save the image to
    # (Image stands for your own model with an ImageField named `image`)
    image = Image()

    # Save the temporary image to the model.
    # This also saves the model, so be sure that it is valid
    image.image.save(file_name, files.File(lf))

Some reference links:

  1. requests - “HTTP for Humans”, which I prefer to urllib2
  2. tempfile - save to a temporary file rather than to a permanent location on disk
  3. Django FileField save
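The image_urls list above is hard-coded. For the question's use case (every image on a page), the list could be built from the page's <img> tags. The following is only a minimal sketch: it uses the standard library's html.parser instead of the BeautifulSoup call in the question, and ImgSrcParser, collect_image_urls and the sample HTML are hypothetical names and data made up for illustration. urljoin resolves protocol-relative ("//host/...") and site-relative ("/path") src values against the page URL, so there is no need to prepend 'http:' by hand.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcParser(HTMLParser):
    """Collects the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def collect_image_urls(page_url, html):
    """Return absolute URLs for all <img> tags found in html."""
    parser = ImgSrcParser()
    parser.feed(html)
    # Resolve relative and protocol-relative paths against the page URL
    return [urljoin(page_url, src) for src in parser.sources]


# Made-up sample markup, standing in for a downloaded page
sample_html = (
    '<img src="//upload.example.org/a.png">'
    '<img src="/static/b.jpg">'
    '<img alt="no src here">'
)
image_urls = collect_image_urls("http://example.org/wiki/Pune", sample_html)
print(image_urls)
```

The resulting list can then be fed to the download loop in place of the hard-coded one.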

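Answer 1 (score: 17)

If you want to save downloaded images without writing them to disk first (without using NamedTemporaryFile etc.), there is an easy way to do that.

This will be slightly quicker than downloading the file and writing it to disk, since it is all done in memory. Note that this example is written for Python 3 - the process is similar in Python 2 but slightly different.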

from django.core import files
from io import BytesIO
import requests

url = "https://example.com/image.jpg"
resp = requests.get(url)
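if resp.status_code != requests.codes.ok:
    # Error handling here; raising is only a placeholder
    raise RuntimeError("Download failed with status %s" % resp.status_code)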

fp = BytesIO()
fp.write(resp.content)
file_name = url.split("/")[-1]  # There's probably a better way of doing this but this is just a quick example
your_model.image_field.save(file_name, files.File(fp))

your_model is an instance of the model you want to save to, and .image_field is the name of the ImageField.

See the documentation for io for more details.
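On the "better way" of getting the file name: splitting the raw URL on "/" keeps any query string or fragment in the name. One possible alternative, sketched here with only the standard library, is to parse the URL first and take the last path segment. The function name file_name_from_url and the "download" fallback are assumptions made for this example, not part of the answer above.

```python
from posixpath import basename
from urllib.parse import unquote, urlparse


def file_name_from_url(url, default="download"):
    """Return the last path segment of url, ignoring query and fragment.

    Falls back to `default` (an arbitrary choice) when the path has no
    file name, e.g. for "https://example.com/".
    """
    path = urlparse(url).path
    # Decode percent-escapes such as %20 before taking the last segment
    name = basename(unquote(path))
    return name or default


print(file_name_from_url("https://example.com/media/image.jpg?size=large#top"))
print(file_name_from_url("https://example.com/"))
```

The result can be passed as file_name to your_model.image_field.save(...) in the snippet above.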

Answer 2 (score: 1)

As an example of what I think you're asking:

In forms.py:

imgfile = forms.ImageField(label = 'Choose your image', help_text = 'The image should be cool.')

In models.py:

imgfile =   models.ImageField(upload_to='images/%m/%d')

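So there will be a POST request from the user (when the user completes the form). That request basically contains a dictionary of data, and the dictionary holds the submitted files. To pick out the file submitted for the field (in our case, an ImageField), you would use:

request.FILES['imgfile']

You would use that when constructing the model object (instantiating your model class):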

newPic = Pic(imgfile = request.FILES['imgfile'])

To save it the simple way, you just use the save() method bestowed on your object (because Django is that awesome):

if form.is_valid():
    newPic = Pic(imgfile = request.FILES['imgfile'])
    newPic.save()

By default, your image will be stored in the directory you specify for MEDIA_ROOT in settings.py.
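More precisely, the upload_to value is applied as a subdirectory of MEDIA_ROOT, and strftime codes such as %m and %d in it are filled in with the upload date. The sketch below only approximates that expansion in plain Python to show where a file would end up; expand_upload_to is a hypothetical helper, not a Django function, and the storage backend may still alter the final name (for example, to avoid overwriting an existing file).

```python
import datetime
import posixpath


def expand_upload_to(upload_to, filename, now=None):
    """Approximate the path, relative to MEDIA_ROOT, for an uploaded file."""
    now = now or datetime.datetime.now()
    # strftime fills in date codes; literal text such as "images/" is kept
    dirname = now.strftime(upload_to)
    return posixpath.join(dirname, filename)


example = expand_upload_to(
    "images/%m/%d", "cat.jpg", datetime.datetime(2013, 4, 23)
)
print(example)  # images/04/23/cat.jpg
```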

Accessing the image in the template:

<img src="{{ MEDIA_URL }}{{ image.imgfile.name }}"></img>

URLs can be tricky, but here is a basic example of a simple URL pattern for serving the stored images:

urlpatterns += patterns('',
        url(r'^media/(?P<path>.*)$', 'django.views.static.serve', {
            'document_root': settings.MEDIA_ROOT,
        }),
   )

I hope it helps.

Answer 3 (score: 0)

Try doing it this way instead of assigning a path to the image...

    import urllib2
    from django.core.files.temp import NamedTemporaryFile
    def handle_upload_url_file(url):
        img_temp = NamedTemporaryFile()
        opener = urllib2.build_opener()
        opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120427 Firefox/15.0a1')]
        img_temp.write(opener.open(url).read())
        img_temp.flush()
        return img_temp

Use the above function like this..

    from django.core.files import File

    new_image = images_data()
    # set the rest of the data in new_image and then do this.
    new_image.image.save(slug_filename, File(handle_upload_url_file(url)))
    # here slug_filename is just the filename that you want to save the file with.

Answer 4 (score: 0)

Similar to @boltfrombluesky's answer above, you can do this in Python 3 without any external dependencies, like so:

from os.path import basename
import urllib.request
from urllib.parse import urlparse
import tempfile

from django.core.files.base import File

def handle_upload_url_file(url, obj):
    img_temp = tempfile.NamedTemporaryFile(delete=True)
    req = urllib.request.Request(
        url, data=None,
        headers={
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
        }
    )
    with urllib.request.urlopen(req) as response:
        img_temp.write(response.read())
    img_temp.flush()
    filename = basename(urlparse(url).path)
    result = obj.image.save(filename, File(img_temp))
    img_temp.close()
    return result

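Answer 5 (score: 0)

If you save images by overriding the model's save method in order to modify the file name, and you are struggling with random invalid file names in Django (as I was), you can follow the code below (copied from the accepted answer):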

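lf = tempfile.NamedTemporaryFile()

# `response` is the streamed requests response, as in the accepted answer
for block in response.iter_content(1024 * 8):
    if not block:
        break
    lf.write(block)

lf.name = name  # Set your custom file name here
dc = ImageFile(file=files.File(lf))

# Save the model instance (the original called dc.file.save() with no
# arguments, which fails because FieldFile.save needs a name and content)
dc.save()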

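I have configured my storage with django-storages in order to upload media content directly to S3. For some reason I was not able to replace the file name. After some R&D, it worked.

Note: I used a FileField in the model, so a few lines of the code are not needed.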

Answer 6 (score: 0)

# this is my solution
from django.core import files
from django.core.files.base import ContentFile

import requests
from .models import MyModel

def download_img():
    r = requests.get("remote_file_url", allow_redirects=True)
    filename = "remote_file_url".split("/")[-1]

    my_model = MyModel(
        file=files.File(ContentFile(r.content), filename)
    )
    my_model.save()

    return

Answer 7 (score: -2)

import urllib2

from django.http import HttpResponse


def qrcodesave(request):
    # Fetch the QR code image and return it as a file download
    url = "http://chart.apis.google.com/chart?cht=qr&chs=300x300&chl=s&chld=H|0"
    opener = urllib2.urlopen(url)
    # content_type replaces the older mimetype keyword argument
    response = HttpResponse(opener.read(), content_type="application/octet-stream")
    response["Content-Disposition"] = "attachment; filename=aktel.png"
    return response