Question

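I wrote a custom Django file upload handler for my current project. It is a proof of concept that lets you compute the hash of an uploaded file without storing that file on disk. It is only a proof of concept, to be sure, but if I can get it working, I can move on to the real purpose of my work.

Essentially, here is what I have so far, and it works well with one exception: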

from django.core.files.uploadhandler import FileUploadHandler, StopFutureHandlers
from hashlib import sha256
from myproject.upload.files import MyProjectUploadedFile

class MyProjectUploadHandler(FileUploadHandler):
    def __init__(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).__init__(*args, **kwargs)

    def handle_raw_input(self, input_data, META, content_length, boundary,
            encoding = None):
        self.activated = True

    def new_file(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).new_file(*args, **kwargs)

        self.digester = sha256()
        raise StopFutureHandlers()

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return MyProjectUploadedFile(self.digester.hexdigest())

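The custom upload handler works well. The hash is accurate, nothing from the upload is stored on disk, and it uses only 64 KB of memory at any one time.

The only problem I have is that I need to access another field from the POST request before the file is processed: a text salt entered by the user. My form looks like this: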
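For readers who want to see the constant-memory idea outside Django, here is a minimal sketch of the same chunked hashing. The `hash_stream` helper and the `CHUNK_SIZE` constant are illustrative names, not part of Django or of the handler above; in the handler, Django supplies the chunks through `receive_data_chunk` instead.

```python
import io
from hashlib import sha256

CHUNK_SIZE = 64 * 1024  # assumption: mirrors the 64 KB figure mentioned above


def hash_stream(stream, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of a binary stream, read chunk by chunk."""
    digester = sha256()
    # iter() with a sentinel stops as soon as read() returns b"" (end of stream),
    # so at most one chunk is held in memory at a time.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digester.update(chunk)
    return digester.hexdigest()


if __name__ == "__main__":
    data = b"x" * (CHUNK_SIZE * 3 + 17)
    # The chunked digest should match hashing the whole payload in one call.
    print(hash_stream(io.BytesIO(data)) == sha256(data).hexdigest())
```

Feeding the digest incrementally gives the same result as hashing everything at once, which is why the handler never needs the whole file.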

<form id="myForm" method="POST" enctype="multipart/form-data" action="/upload/">
    <fieldset>
        <input name="salt" type="text" placeholder="Salt">
        <input name="uploadfile" type="file">
        <input type="submit">
    </fieldset>
</form>

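The "salt" POST variable only becomes available after the request has been processed and the file has been uploaded, which does not work for my use case. I cannot seem to find a way to access this variable in any way, shape, or form from within my upload handler.

Is there a way for me to access each multipart variable, rather than only the files that are uploaded?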

Answer 1

My solution didn't come easy, but here it is:

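import base64

from django.core.files.uploadhandler import (
    FileUploadHandler, SkipFile, StopUpload)
from django.http.multipartparser import (
    FIELD, FILE, ChunkIter, LazyStream, MultiPartParserError, Parser, exhaust)
from django.utils.datastructures import MultiValueDict
from django.utils.encoding import force_text
from django.utils.text import unescape_entities


class IntelligentUploadHandler(FileUploadHandler):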
    """
    An upload handler which overrides the default multipart parser to allow
    simultaneous parsing of fields and files... intelligently. Subclass this
    for real and true awesomeness.
    """

    def __init__(self, *args, **kwargs):
        super(IntelligentUploadHandler, self).__init__(*args, **kwargs)

    def field_parsed(self, field_name, field_value):
        """
        A callback method triggered when a non-file field has been parsed 
        successfully by the parser. Use this to listen for new fields being
        parsed.
        """
        pass

    def handle_raw_input(self, input_data, META, content_length, boundary,
            encoding = None):
        """
        Parse the raw input from the HTTP request and split items into fields
        and files, executing callback methods as necessary.

        Shamelessly adapted and borrowed from django.http.multipartparser.MultiPartParser.
        """
        # following suit from the source class, this is imported here to avoid
        # a potential circular import
        from django.http import QueryDict

        # create return values
        self.POST = QueryDict('', mutable=True)
        self.FILES = MultiValueDict()

        # initialize the parser and stream
        stream = LazyStream(ChunkIter(input_data, self.chunk_size))

        # whether or not to signal a file-completion at the beginning of the loop.
        old_field_name = None
        counter = 0

        try:
            for item_type, meta_data, field_stream in Parser(stream, boundary):
                if old_field_name:
                    # we run this test at the beginning of the next loop since
                    # we cannot be sure a file is complete until we hit the next
                    # boundary/part of the multipart content.
                    file_obj = self.file_complete(counter)

                    if file_obj:
                        # if we return a file object, add it to the files dict
                        self.FILES.appendlist(force_text(old_field_name, encoding,
                            errors='replace'), file_obj)

                    # wipe it out to prevent havoc
                    old_field_name = None
                try: 
                    disposition = meta_data['content-disposition'][1]
                    field_name = disposition['name'].strip()
                except (KeyError, IndexError, AttributeError):
                    continue

                transfer_encoding = meta_data.get('content-transfer-encoding')

                if transfer_encoding is not None:
                    transfer_encoding = transfer_encoding[0].strip()

                field_name = force_text(field_name, encoding, errors='replace')

                if item_type == FIELD:
                    # this is a POST field
                    if transfer_encoding == "base64":
                        raw_data = field_stream.read()
                        try:
                            data = base64.b64decode(raw_data)
                        except Exception:
                            data = raw_data
                    else:
                        data = field_stream.read()

                    self.POST.appendlist(field_name, force_text(data, encoding,
                        errors='replace'))

                    # trigger listener
                    self.field_parsed(field_name, self.POST.get(field_name))
                elif item_type == FILE:
                    # this is a file
                    file_name = disposition.get('filename')

                    if not file_name:
                        continue

                    # transform the file name
                    file_name = force_text(file_name, encoding, errors='replace')
                    file_name = self.IE_sanitize(unescape_entities(file_name))

                    content_type = meta_data.get('content-type', ('',))[0].strip()

                    try:
                        charset = meta_data.get('content-type', (0, {}))[1].get('charset', None)
                    except:
                        charset = None

                    try:
                        file_content_length = int(meta_data.get('content-length')[0])
                    except (IndexError, TypeError, ValueError):
                        file_content_length = None

                    counter = 0

                    # now, do the important file stuff
                    try:
                        # alert on the new file
                        self.new_file(field_name, file_name, content_type,
                                file_content_length, charset)

                        # chubber-chunk it
                        for chunk in field_stream:
                            if transfer_encoding == "base64":
                                # base 64 decode it if need be
                                over_bytes = len(chunk) % 4

                                if over_bytes:
                                    over_chunk = field_stream.read(4 - over_bytes)
                                    chunk += over_chunk

                                try:
                                    chunk = base64.b64decode(chunk)
                                except Exception as e:
                                    # since this is only a chunk, any error is an unfixable error
                                    raise MultiPartParserError("Could not decode base64 data: %r" % e)

                            chunk_length = len(chunk)
                            self.receive_data_chunk(chunk, counter)
                            counter += chunk_length
                            # ... and we're done
                    except SkipFile:
                        # just eat the rest
                        exhaust(field_stream)
                    else:
                        # handle file upload completions on next iteration
                        old_field_name = field_name

        except StopUpload as e:
            # if we get a request to stop the upload, exhaust it if no con reset
            if not e.connection_reset:
                exhaust(input_data)
        else:
            # make sure that the request data is all fed
            exhaust(input_data)

        # signal the upload has been completed
        self.upload_complete()

        return self.POST, self.FILES

    def IE_sanitize(self, filename):
        """Cleanup filename from Internet Explorer full paths."""
        return filename and filename[filename.rfind("\\")+1:].strip()

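Basically, by subclassing this class you can have a more... intelligent upload handler. Fields are announced to subclasses through the field_parsed method, which is what I needed for my purposes.

I have reported this to the Django team as a feature request, in the hope that this functionality becomes part of Django's regular toolbox rather than requiring the source code to be monkey-patched as above.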
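As a rough sketch of how a subclass might use these callbacks for the salted hash from the question, the class below mimics the callback sequence without needing Django. The `SaltedHashCallbacks` name and the choice to hash the salt before the file content are assumptions made for illustration; in a real project the methods would override those on IntelligentUploadHandler, and the salt field would have to precede the file input in the form, as it does in the question's markup.

```python
from hashlib import sha256


class SaltedHashCallbacks:
    """Stand-in for a subclass of IntelligentUploadHandler (hypothetical name).

    It only mirrors the callback signatures so the flow can run without Django.
    """

    def __init__(self):
        self.salt = ""
        self.digester = None

    def field_parsed(self, field_name, field_value):
        # Remember the salt as soon as the parser reports the field.
        if field_name == "salt":
            self.salt = field_value

    def new_file(self, field_name, file_name, *args, **kwargs):
        # Start a fresh digest seeded with whatever salt has been seen so far
        # (assumed scheme: salt bytes first, then the file content).
        self.digester = sha256(self.salt.encode("utf-8"))

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        # The question's handler wraps this in MyProjectUploadedFile instead.
        return self.digester.hexdigest()


if __name__ == "__main__":
    handler = SaltedHashCallbacks()
    # Simulate the order in which the parser would fire the callbacks.
    handler.field_parsed("salt", "pepper")
    handler.new_file("uploadfile", "a.txt")
    handler.receive_data_chunk(b"hello ", 0)
    handler.receive_data_chunk(b"world", 6)
    print(handler.file_complete(11))
```

If the file part arrived before the salt, `new_file` would see an empty salt, so the field order in the form matters for this approach.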

Answer 2

Based on the code for FileUploadHandler, found at line 62 of:

https://github.com/django/django/blob/master/django/core/files/uploadhandler.py

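It looks like the request object is passed into the handler and stored as self.request.

In that case, you should be able to access the salt anywhere in your upload handler by doing: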

salt = self.request.POST.get('salt')

Unless I'm misunderstanding your question.
