Wagtail documents: upload of large files (> 2GB) fails

Date: 2019-05-13 02:25:02

Tags: python django wagtail

I am trying to upload a file using the built-in wagtaildocs app in my Wagtail application. I set up my Ubuntu 16.04 server following the Digital Ocean tutorial approach with Nginx | Gunicorn | Postgres.

Some initial clarifications:

  1. In my Nginx config, I have set client_max_body_size 10000M;
  2. In my production settings I have the following lines: MAX_UPLOAD_SIZE = "5242880000" and WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
  3. My file type is .zip
  4. This is a production test at this point. I have only implemented the basic Wagtail app, with no other modules.

From my configuration standpoint, I should be fine as long as my file is under 10Gb, unless I am missing something or blind to a typo.

I have already tried bumping all of these configuration values to unreasonably large numbers. I have also tried other file extensions; neither changes my error.

I think this is related to the TCP or SSL connection being closed during the session. I have never run into this problem before, so I would appreciate some help.

Here is my error message:

Internal Server Error: /admin/documents/multiple/add/
Traceback (most recent call last):
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.DatabaseError: SSL SYSCALL error: Operation timed out


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/urls/__init__.py", line 102, in wrapper
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/decorators.py", line 34, in decorated_view
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/utils.py", line 151, in wrapped_view_func
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/vary.py", line 20, in inner_func
    response = func(*args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/documents/views/multiple.py", line 60, in add
    doc.save()
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 741, in save
    force_update=force_update, update_fields=update_fields)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 779, in save_base
    force_update, using, update_fields,
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 870, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 908, in _do_insert
    using=using, raw=raw)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1335, in execute_sql
    cursor.execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 99, in execute
    return super().execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.DatabaseError: SSL SYSCALL error: Operation timed out

Here are my settings:

### base.py ###
import os

PROJECT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
BASE_DIR = os.path.dirname(PROJECT_DIR)
SECRET_KEY = os.getenv('SECRET_KEY_WAGTAILDEV')

# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.2/howto/deployment/checklist/


# Application definition

INSTALLED_APPS = [
    'home',
    'search',

    'wagtail.contrib.forms',
    'wagtail.contrib.redirects',
    'wagtail.embeds',
    'wagtail.sites',
    'wagtail.users',
    'wagtail.snippets',
    'wagtail.documents',
    'wagtail.images',
    'wagtail.search',
    'wagtail.admin',
    'wagtail.core',

    'modelcluster',
    'taggit',

    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'storages',
]

MIDDLEWARE = [
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'django.middleware.security.SecurityMiddleware',

    'wagtail.core.middleware.SiteMiddleware',
    'wagtail.contrib.redirects.middleware.RedirectMiddleware',
]

ROOT_URLCONF = 'wagtaildev.urls'

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [
            os.path.join(PROJECT_DIR, 'templates'),
        ],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

WSGI_APPLICATION = 'wagtaildev.wsgi.application'


# Database
# https://docs.djangoproject.com/en/2.2/ref/settings/#databases

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'HOST': os.getenv('DATABASE_HOST_WAGTAILDEV'),
        'USER': os.getenv('DATABASE_USER_WAGTAILDEV'),
        'PASSWORD': os.getenv('DATABASE_PASSWORD_WAGTAILDEV') ,
        'NAME': os.getenv('DATABASE_NAME_WAGTAILDEV'),
        'PORT': '5432',
    }
}


# Password validation
# https://docs.djangoproject.com/en/2.2/ref/settings/#auth-password-validators

AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]


# Internationalization
# https://docs.djangoproject.com/en/2.2/topics/i18n/

LANGUAGE_CODE = 'en-us'

TIME_ZONE = 'UTC'

USE_I18N = True

USE_L10N = True

USE_TZ = True


# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.2/howto/static-files/

STATICFILES_FINDERS = [
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder',
]

STATICFILES_DIRS = [
    os.path.join(PROJECT_DIR, 'static'),
]

# ManifestStaticFilesStorage is recommended in production, to prevent outdated
# Javascript / CSS assets being served from cache (e.g. after a Wagtail upgrade).
# See https://docs.djangoproject.com/en/2.2/ref/contrib/staticfiles/#manifeststaticfilesstorage
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'

STATIC_ROOT = os.path.join(BASE_DIR, 'static')
STATIC_URL = '/static/'

MEDIA_ROOT = os.path.join(BASE_DIR, 'media')
MEDIA_URL = '/media/'


# Wagtail settings

WAGTAIL_SITE_NAME = "wagtaildev"

# Base URL to use when referring to full URLs within the Wagtail admin backend -
# e.g. in notification emails. Don't include '/admin' or a trailing slash
BASE_URL = 'http://example.com'

### production.py ###

from .base import *

DEBUG = True

ALLOWED_HOSTS = ['wagtaildev.wesgarlock.com', '127.0.0.1','134.209.230.125']

from wagtaildev.aws.conf import *

EMAIL_BACKEND = 'django.core.mail.backends.console.EmailBackend'

MAX_UPLOAD_SIZE = "5242880000"
WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
FILE_UPLOAD_TEMP_DIR = str(os.path.join(BASE_DIR, 'tmp'))

2 Answers:

Answer 0 (score: 0)

I suspect the exception psycopg2.DatabaseError: SSL SYSCALL error: Operation timed out happens when the Droplet runs out of memory.

Try creating a swap partition or adding more memory.

Creating a swap partition

Answer 1 (score: 0)

I was never able to solve this directly, but I did come up with a workaround.

I am not a Wagtail or Django expert, so I am sure there is a proper solution to this, but in any case, here is what I did. If you have any suggestions for improvements, please feel free to comment.

Note that this really is documentation, as much as anything to remind myself of what I did. There are a lot of redundant lines of code at this point (05-25-19), because I Frankensteined a lot of code together. I will edit it down over time.

These are the tutorials I Frankensteined together to create this solution:

  1. https://www.codingforentrepreneurs.com/blog/large-file-uploads-with-amazon-s3-django/
  2. http://docs.wagtail.io/en/v2.1.1/advanced_topics/documents/custom_document_model.html
  3. https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html
  4. https://medium.com/faun/summary-667d0fdbcdae
  5. http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-browser-credentials-federated-id.html
  6. https://kite.com/python/examples/454/threading-wait-for-a-thread-to-finish
  7. http://docs.celeryproject.org/en/latest/userguide/daemonizing.html#usage-systemd

There may be a few others, but those are the principal ones.

OK, here we go.

I created an app called "files" and customized the document model in its models.py file. You need to specify WAGTAILDOCS_DOCUMENT_MODEL = 'files.LargeDocument' in your settings file, as shown below. The only reason I did this was to track the behavior I am changing more explicitly. This custom document model simply extends the standard Document model in Wagtail.
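For reference, that settings line is just the following (a minimal sketch; which settings file it goes in, e.g. base.py or production.py, is your choice):

#settings (e.g. production.py)
WAGTAILDOCS_DOCUMENT_MODEL = 'files.LargeDocument'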

#models.py

from django.db import models
from wagtail.documents.models import AbstractDocument
from wagtail.admin.edit_handlers import FieldPanel
# Create your models here.
class LargeDocument(AbstractDocument):

    admin_form_fields = (
        'file',
    )
    panels = [
        FieldPanel('file', classname='fn'),
    ]
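Note that the views and Celery task further down also read and write fields such as file_hash, path, size, uploaded and type on LargeDocument, which are not shown in the snippet above. A sketch of what those definitions might look like (the field names come from the views; the types, and whether your Wagtail version already provides some of them on AbstractDocument, are assumptions):

#models.py (hypothetical extra fields implied by views.py and tasks.py)
class LargeDocument(AbstractDocument):
    # ... admin_form_fields and panels as above ...
    file_hash = models.TextField(blank=True)              # signature stored by FilePolicyAPI
    path = models.CharField(max_length=500, blank=True)   # eventual path inside the S3 bucket
    size = models.BigIntegerField(null=True, blank=True)  # reported by the browser after upload
    uploaded = models.BooleanField(default=False)         # set once the S3 upload completes
    type = models.CharField(max_length=255, blank=True)   # MIME type reported by the browser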

Next, you need to create a wagtail_hooks.py file containing the following.

#wagtail_hooks.py
from wagtail.contrib.modeladmin.options import (
    ModelAdmin, modeladmin_register)
from .models import LargeDocument
from .views import LargeDocumentAdminView


class LargeDocumentAdmin(ModelAdmin):
    model = LargeDocument

    menu_label = 'Large Documents'  # ditch this to use verbose_name_plural from model
    menu_icon = 'pilcrow'  # change as required
    menu_order = 200  # will put in 3rd place (000 being 1st, 100 2nd)
    add_to_settings_menu = False  # or True to add your model to the Settings sub-menu
    exclude_from_explorer = False # or True to exclude pages of this type from Wagtail's explorer view

    create_template_name ='large_document_index.html'

# Now you just need to register your customised ModelAdmin class with Wagtail
modeladmin_register(LargeDocumentAdmin)

This lets you do two things:

  1. Create a new menu item for uploading large documents, while keeping the standard functionality of the standard Documents menu item.
  2. Specify a custom html file for handling the large uploads.

Here is the html:

{% extends "wagtailadmin/base.html" %}
{% load staticfiles cache %}
{% load static wagtailuserbar %}
{% load compress %}
{% load underscore_hyphan_to_space %}
{% load url_vars %}
{% load pagination_value %}

{% load static %}
{% load i18n %}

{% block titletag %}{{ view.page_title }}{% endblock %}

{% block content %}

    {% include "wagtailadmin/shared/header.html" with title=view.page_title icon=view.header_icon %}
          <!-- Google Signin Button -->
          <div class="g-signin2" data-onsuccess="onSignIn" data-theme="dark">
          </div>
          <!-- Select the file to upload -->

          <div class="input-group mb-3">
            <link rel="stylesheet" href="{% static 'css/input.css'%}"/>
            <div class="custom-file">
              <input type="file" class="custom-file-input" id="file" name="file">
              <label id="file_label" class="custom-file-label" style="width:auto!important;" for="inputGroupFile02" aria-describedby="inputGroupFileAddon02">Choose file</label>
            </div>
            <div class="input-group-append">
              <span class="input-group-text" id="file_submission_button">Upload</span>
            </div>
            <div id="start_progress"></div>
          </div>
          <div class="progress-upload">
            <div class="progress-upload-bar" role="progressbar" style="width: 100%;" aria-valuenow="100" aria-valuemin="0" aria-valuemax="100"></div>
          </div>
{% endblock %}

{% block extra_js %}
    {{ block.super }}
    {{ form.media.js }}
    <script src="https://apis.google.com/js/platform.js" async defer></script>
    <script src="https://sdk.amazonaws.com/js/aws-sdk-2.148.0.min.js"></script>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
    <script src="{% static 'js/awsupload.js' %}"></script>
{% endblock %}

{% block extra_css %}
    {{ block.super }}
    {{ form.media.css }}
    <meta name="google-signin-client_id" content="847336061839-9h651ek1dv7u1i0t4edsk8pd20d0lkf3.apps.googleusercontent.com">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">

{% endblock %}

Then I created some objects in views.py:

#views.py
from django.shortcuts import render

# Create your views here.
import base64
import hashlib
import hmac
import os
import time
from rest_framework import permissions, status, authentication
from rest_framework.response import Response
from rest_framework.views import APIView
from .config_aws import (
    AWS_UPLOAD_BUCKET,
    AWS_UPLOAD_REGION,
    AWS_UPLOAD_ACCESS_KEY_ID,
    AWS_UPLOAD_SECRET_KEY
)
from .models import LargeDocument
import datetime
from wagtail.contrib.modeladmin.views import WMABaseView
from django.db.models.fields.files import FieldFile
from django.core.files import File
import urllib.request
from django.core.mail import send_mail
from .tasks import file_creator

class FilePolicyAPI(APIView):
    """
    This view is to get the AWS Upload Policy for our s3 bucket.
    What we do here is first create a LargeDocument object instance in our
    Django backend. This is to include the LargeDocument instance in the path
    we will use within our bucket as you'll see below.
    """
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        """
        The initial post request includes the filename
        and auth credientails. In our case, we'll use
        Session Authentication but any auth should work.
        """
        filename_req = request.data.get('filename')
        if not filename_req:
                return Response({"message": "A filename is required"}, status=status.HTTP_400_BAD_REQUEST)
        policy_expires = int(time.time()+5000)
        user = request.user
        username_str = str(request.user.username)
        """
        Below we create the Django object. We'll use this
        in our upload path to AWS.

        Example:
        To-be-uploaded file's name: Some Random File.mp4
        Eventual Path on S3: <bucket>/username/2312/2312.mp4
        """
        doc_obj = LargeDocument.objects.create(uploaded_by_user=user, )
        doc_obj_id = doc_obj.id
        doc_obj.title=filename_req
        upload_start_path = "{location}".format(
                    location = "LargeDocuments/",
            )
        file_extension = os.path.splitext(filename_req)[1]  # splitext() returns (root, ext); keep only the extension
        filename_final = "{title}".format(
                    title= filename_req,
                )
        """
        Eventual file_upload_path includes the renamed file to the
        Django-stored LargeDocument instance ID. Renaming the file is
        done to prevent issues with user generated formatted names.
        """
        final_upload_path = "{upload_start_path}/{filename_final}".format(
                                 upload_start_path=upload_start_path,
                                 filename_final=filename_final,
                            )
        if filename_req and file_extension:
            """
            Save the eventual path to the Django-stored LargeDocument instance
            """
            policy_document_context = {
                "expire": policy_expires,
                "bucket_name": AWS_UPLOAD_BUCKET,
                "key_name": "",
                "acl_name": "public-read",
                "content_name": "",
                "content_length": 524288000,
                "upload_start_path": upload_start_path,

                }
            policy_document = """
            {"expiration": "2020-01-01T00:00:00Z",
              "conditions": [
                {"bucket": "%(bucket_name)s"},
                ["starts-with", "$key", "%(upload_start_path)s"],
                {"acl": "public-read"},

                ["starts-with", "$Content-Type", "%(content_name)s"],
                ["starts-with", "$filename", ""],
                ["content-length-range", 0, %(content_length)d]
              ]
            }
            """ % policy_document_context
            aws_secret = str.encode(AWS_UPLOAD_SECRET_KEY)
            policy_document_str_encoded = str.encode(policy_document.replace(" ", ""))
            url = 'https://thearchmedia.s3.amazonaws.com/'
            policy = base64.b64encode(policy_document_str_encoded)
            signature = base64.b64encode(hmac.new(aws_secret, policy, hashlib.sha1).digest())
            doc_obj.file_hash = signature
            doc_obj.path = final_upload_path

            doc_obj.save()



        data = {
            "policy": policy,
            "signature": signature,
            "key": AWS_UPLOAD_ACCESS_KEY_ID,
            "file_bucket_path": upload_start_path,
            "file_id": doc_obj_id,
            "filename": filename_final,
            "url": url,
            "username": username_str,
        }
        return Response(data, status=status.HTTP_200_OK)

class FileUploadCompleteHandler(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        size = request.POST.get('fileSize')
        data = {}
        type_ = request.POST.get('fileType')
        if file_id:
            obj = LargeDocument.objects.get(id=int(file_id))
            obj.size = int(size)
            obj.uploaded = True
            obj.type = type_
            obj.file_hash
            obj.save()
            data['id'] = obj.id
            data['saved'] = True
            data['url']=obj.url
        return Response(data, status=status.HTTP_200_OK)

class ModelFileCompletion(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        url = request.POST.get('aws_url')
        data = {}
        if file_id:
            obj = LargeDocument.objects.get(id=int(file_id))
            file_creator.delay(obj.pk)
            data['test'] = 'process started'
        return Response(data, status=status.HTTP_200_OK)

def LargeDocumentAdminView(request):
    # Note: super(WMABaseView, self) only works inside a class, so this plain
    # function view just renders the template directly.
    return render(request, 'modeladmin/files/index.html')
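The views above also import four constants from a config_aws module that is not shown in the post. A minimal sketch of that file, assuming the values are read from environment variables (the bucket name 'thearchmedia' appears elsewhere in the code; the region default here is just a placeholder):

#config_aws.py (sketch, not shown in the original post)
import os

AWS_UPLOAD_BUCKET = os.getenv('AWS_UPLOAD_BUCKET', 'thearchmedia')
AWS_UPLOAD_REGION = os.getenv('AWS_UPLOAD_REGION', 'us-east-1')
AWS_UPLOAD_ACCESS_KEY_ID = os.getenv('AWS_UPLOAD_ACCESS_KEY_ID')
AWS_UPLOAD_SECRET_KEY = os.getenv('AWS_UPLOAD_SECRET_KEY')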

These views work around the standard file-handling system. I did not want to abandon the standard file-handling system or write a new one, which is why I call this hack a less-than-ideal solution.
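The JavaScript below posts to /api/files/policy/, /api/files/complete/ and /api/files/modelcomplete/. The URL configuration is not shown in the post, but the wiring implied by those calls would look roughly like this (URL paths are inferred from awsupload.js; note the views also rely on Django REST Framework, so 'rest_framework' would need to be installed and added to INSTALLED_APPS):

#urls.py (sketch; include these in the project ROOT_URLCONF)
from django.urls import path
from files.views import FilePolicyAPI, FileUploadCompleteHandler, ModelFileCompletion

urlpatterns = [
    path('api/files/policy/', FilePolicyAPI.as_view()),
    path('api/files/complete/', FileUploadCompleteHandler.as_view()),
    path('api/files/modelcomplete/', ModelFileCompletion.as_view()),
]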

// javascript upload file "awsupload.js"
var id_token; //token we get upon Authentication with Web Identiy Provider
function onSignIn(googleUser) {
  var profile = googleUser.getBasicProfile();
  // The ID token you need to pass to your backend:
  id_token = googleUser.getAuthResponse().id_token;
}

$(document).ready(function(){

  // setup session cookie data. This is Django-related
  function getCookie(name) {
      var cookieValue = null;
      if (document.cookie && document.cookie !== '') {
          var cookies = document.cookie.split(';');
          for (var i = 0; i < cookies.length; i++) {
              var cookie = jQuery.trim(cookies[i]);
              // Does this cookie string begin with the name we want?
              if (cookie.substring(0, name.length + 1) === (name + '=')) {
                  cookieValue = decodeURIComponent(cookie.substring(name.length + 1));
                  break;
              }
          }
      }
      return cookieValue;
  }
  var csrftoken = getCookie('csrftoken');
  function csrfSafeMethod(method) {
      // these HTTP methods do not require CSRF protection
      return (/^(GET|HEAD|OPTIONS|TRACE)$/.test(method));
  }
  $.ajaxSetup({
      beforeSend: function(xhr, settings) {
          if (!csrfSafeMethod(settings.type) && !this.crossDomain) {
              xhr.setRequestHeader("X-CSRFToken", csrftoken);
          }
      }
  });
  // end session cookie data setup.

  // declare an empty array for potential uploaded files
  var fileItemList = []

  $(document).on('click','#file_submission_button', function(event){
      var progress = 0;  // start the bar at 0; it is updated later by the httpUploadProgress handler
      var selectedFiles = $('#file').prop('files');
      formItem = $(this).parent()
      $.each(selectedFiles, function(index, item){
          uploadFile(item)
      })
      $(this).val('');
      $('.progress-upload-bar').attr('aria-valuenow',progress);
      $('.progress-upload-bar').attr('width',progress.toString()+'%');
      $('.progress-upload-bar').attr('style',"width:"+progress.toString()+'%');
      $('.progress-upload-bar').text(progress.toString()+'%');
  })
  $(document).on('change','#file', function(event){
      var selectedFiles = $('#file').prop('files');
      $('#file_label').text(selectedFiles[0].name)
  })



  function constructFormPolicyData(policyData, fileItem) {
     var contentType = fileItem.type != '' ? fileItem.type : 'application/octet-stream'
      var url = policyData.url
      var filename = policyData.filename
      var repsonseUser = policyData.user
      // var keyPath = 'www/' + repsonseUser + '/' + filename
      var keyPath = policyData.file_bucket_path
      var fd = new FormData()
      fd.append('key', keyPath + filename);
      fd.append('acl','private');
      fd.append('Content-Type', contentType);
      fd.append("AWSAccessKeyId", policyData.key)
      fd.append('Policy', policyData.policy);
      fd.append('filename', filename);
      fd.append('Signature', policyData.signature);
      fd.append('file', fileItem);
      return fd
  }

  function fileUploadComplete(fileItem, policyData){
      data = {
          uploaded: true,
          fileSize: fileItem.size,
          file: policyData.file_id,

      }
      $.ajax({
          method:"POST",
          data: data,
          url: "/api/files/complete/",
          success: function(data){
              displayItems(fileItemList)
          },
          error: function(jqXHR, textStatus, errorThrown){
              alert("An error occured, please refresh the page.")
          }
      })
  }

  function modelComplete(policyData, aws_url){
      data = {
          file: policyData.file_id,
          aws_url: aws_url
      }
      $.ajax({
          method:"POST",
          data: data,
          url: "/api/files/modelcomplete/",
          success: function(data){
              console.log('model complete success')
          },
          error: function(jqXHR, textStatus, errorThrown){
              alert("An error occured, please refresh the page.")
          }
      })
  }

  function displayItems(fileItemList){
      var itemList = $('.item-loading-queue')
      itemList.html("")
      $.each(fileItemList, function(index, obj){
          var item = obj.file
          var id_ = obj.id
          var order_ = obj.order
          var html_ = "<div class=\"progress\">" +
            "<div class=\"progress-bar\" role=\"progressbar\" style='width:" + item.progress + "%' aria-valuenow='" + item.progress + "' aria-valuemin=\"0\" aria-valuemax=\"100\"></div></div>"
          itemList.append("<div>" + order_ + ") " + item.name + "<a href='#' class='srvup-item-upload float-right' data-id='" + id_ + ")'>X</a> <br/>" + html_ + "</div><hr/>")

      })
  }

  function uploadFile(fileItem){
          var policyData;
          var newLoadingItem;
          // get AWS upload policy for each file uploaded through the POST method
          // Remember we're creating an instance in the backend so using POST is
          // needed.
          $.ajax({
              method:"POST",
              data: {
                  filename: fileItem.name
              },
              url: "/api/files/policy/",
              success: function(data){
                      policyData = data
              },
              error: function(data){
                  alert("An error occured, please try again later")
              }
          }).done(function(){
              // construct the needed data using the policy for AWS
              var file = fileItem;
              AWS.config.credentials = new AWS.WebIdentityCredentials({
                  RoleArn: 'arn:aws:iam::120974195102:role/thearchmedia-google-role',
                  ProviderId: null, // this is null for Google
                  WebIdentityToken: id_token // Access token from identity provider
              });
              var bucket = 'thearchmedia'
              var key = 'LargeDocuments/'+file.name
              var aws_url = 'https://'+bucket+'.s3.amazonaws.com/'+ key
              var s3bucket = new AWS.S3({params: {Bucket: bucket}});
              var params = {Key: key , ContentType: file.type, Body: file, ACL:'public-read', };
              s3bucket.upload(params, function (err, data) {
                  $('#results').html(err ? 'ERROR!' : 'UPLOADED :' + data.Location);
                }).on(
                  'httpUploadProgress', function(evt) {
                    progress = parseInt((evt.loaded * 100) / evt.total)
                    $('.progress-upload-bar').attr('aria-valuenow',progress)
                    $('.progress-upload-bar').attr('width',progress.toString()+'%')
                    $('.progress-upload-bar').attr('style',"width:"+progress.toString()+'%')
                    $('.progress-upload-bar').text(progress.toString()+'%')

                  }).send(
                    function(err, data) {
                      alert("File uploaded successfully.")
                      fileUploadComplete(fileItem, policyData)
                      modelComplete(policyData, aws_url)
                    });
          })
  }


})

Explanation of how the .js and views.py interact

First, an Ajax call with the file information in the header creates the Document object, but because the file never touches the server, no "File" object is created on the Document object. That "File" object contains functionality I need, so there is more to do. Next, my javascript file uploads the file to my s3 bucket using the AWS JavaScript SDK. The s3bucket.upload() function from the SDK is robust enough to upload files up to 5GB, and with some additional modifications it can upload up to 5TB (the limit). Once the file has been uploaded to the s3 bucket, my final API call fires. That final API call triggers a Celery task, which downloads the file into a temporary directory on my remote server. Once the file exists on my remote server, the File object is created and saved onto the document model.

The tasks.py file handles downloading the file from the S3 bucket to the remote server, then creates the File object and saves it onto the document model.

#tasks.py
from .models import LargeDocument
from celery import shared_task
import urllib.request
from django.core.mail import send_mail
from django.core.files import File
import threading

@shared_task
def file_creator(pk_num):
    obj = LargeDocument.objects.get(pk=pk_num)
    tmp_loc = 'tmp/'+ obj.title
    def downloadit():
        urllib.request.urlretrieve('https://thearchmedia.s3.amazonaws.com/LargeDocuments/' + obj.title, tmp_loc)

    def after_dwn():
         dwn_thread.join()           #waits till thread1 has completed executing
         #next chunk of code after download, goes here
         send_mail(
             obj.title + ' has finished to downloading to the server',
             obj.title + 'Downloaded to server',
             'info@thearchmedia.com',
             ['wes@wesgarlock.com'],
             fail_silently=False,
         )
         reopen = open(tmp_loc, 'rb')
         django_file = File(reopen)
         obj.file = django_file
         obj.save()
         send_mail(
             obj.title + ' has finished to downloading to the server',
             'File Model Created for' + obj.title,
             'info@thearchmedia.com',
             ['wes@wesgarlock.com'],
             fail_silently=False,
         )

    dwn_thread = threading.Thread(target=downloadit)
    dwn_thread.start()

    metadata_thread = threading.Thread(target=after_dwn)
    metadata_thread.start()

This is the part that needs to run in Celery, because downloading a large file takes time and I did not want to wait with the browser open. Also inside tasks.py is a python thread() that forces the process to wait until the file has finished downloading to the remote server. If you are new to Celery, their documentation starts here (http://docs.celeryproject.org/en/master/getting-started/introduction.html).
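For completeness, a @shared_task only runs if a Celery app is configured for the project. That part is not shown in the post, but the standard Django + Celery boilerplate is roughly the following (the settings module path is an assumption based on the base.py/production.py split above):

#celery.py, placed inside the wagtaildev project package (sketch, not in the original post)
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'wagtaildev.settings.production')

app = Celery('wagtaildev')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()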

I also added some email notifications to confirm that the process has finished.

One final note: I created a /tmp directory in my project and set it up so that every file in it gets deleted daily, to keep it behaving like a proper tmp directory.

crontab -e
find ~/thearchmedia/tmp -mtime +1 -delete