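I wrote a custom Django file upload handler for my current project. It is a proof of concept that lets you compute the hash of an uploaded file without storing that file on disk. It is only a proof of concept, but if I can get it working, I can move on to the real purpose of my work.
Here is basically what I have so far, which works fine with one notable exception: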
from django.core.files.uploadhandler import *
from hashlib import sha256
from myproject.upload.files import MyProjectUploadedFile


class MyProjectUploadHandler(FileUploadHandler):
    def __init__(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).__init__(*args, **kwargs)

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        self.activated = True

    def new_file(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).new_file(*args, **kwargs)

        self.digester = sha256()
        raise StopFutureHandlers()

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return MyProjectUploadedFile(self.digester.hexdigest())
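The custom upload handler works great. The hash is accurate, nothing from the uploaded file is stored on disk, and it only uses 64kb of memory at any one time.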
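The chunked hashing that the handler relies on can be illustrated outside Django. The following is a minimal sketch, not the handler itself: `hash_stream` and `CHUNK_SIZE` are names made up for this example, and an in-memory `BytesIO` stands in for the upload stream. It shows that feeding a digest one 64 KB chunk at a time gives the same result as hashing the whole payload at once.

```python
import io
from hashlib import sha256

# Hypothetical constant for this sketch; 64 KB matches the figure quoted above.
CHUNK_SIZE = 64 * 1024


def hash_stream(stream, chunk_size=CHUNK_SIZE):
    """Hash a file-like object while holding at most one chunk in memory."""
    digester = sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means the stream is exhausted.
            break
        digester.update(chunk)
    return digester.hexdigest()


# A 200 KB payload, so the loop runs over several chunks.
payload = b"x" * (200 * 1024)
streamed = hash_stream(io.BytesIO(payload))
one_shot = sha256(payload).hexdigest()
```

Both `streamed` and `one_shot` hold the same hex digest, which is why `receive_data_chunk` can simply call `update` on every chunk it is given.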
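The only problem I'm having is that I need to access another field from the POST request before processing the file: a text salt entered by the user. My form looks like this: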
<form id="myForm" method="POST" enctype="multipart/form-data" action="/upload/">
    <fieldset>
        <input name="salt" type="text" placeholder="Salt">
        <input name="uploadfile" type="file">
        <input type="submit">
    </fieldset>
</form>
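The "salt" POST variable only becomes available after the request has been processed and the file has been uploaded, which doesn't work for my use case. I can't find any way, shape, or form to access this variable from within my upload handler.
Is there a way for me to access each multipart variable as it comes across, rather than only the uploaded files?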
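To make the ordering concrete, here is a small sketch of what a multipart body for the form above might look like on the wire. It uses only the standard library `email` package rather than Django's parser, and the boundary string, field values, and file name are invented for the example. The point is that the parts arrive one after another, so a parser that walks the body in order can see the salt part before it reaches the file part.

```python
from email import message_from_bytes

# Hypothetical boundary and field contents, chosen only for this sketch.
BOUNDARY = "demo-boundary"

body = (
    "--{b}\r\n"
    'Content-Disposition: form-data; name="salt"\r\n'
    "\r\n"
    "pepper\r\n"
    "--{b}\r\n"
    'Content-Disposition: form-data; name="uploadfile"; filename="a.txt"\r\n'
    "Content-Type: text/plain\r\n"
    "\r\n"
    "file contents\r\n"
    "--{b}--\r\n"
).format(b=BOUNDARY).encode("utf-8")

# Prepend the Content-Type header so the email parser knows the boundary.
header = "Content-Type: multipart/form-data; boundary={b}\r\n\r\n".format(b=BOUNDARY)
message = message_from_bytes(header.encode("utf-8") + body)

# Collect the field names in the order the parts appear in the body.
part_names = [
    part.get_param("name", header="content-disposition")
    for part in message.get_payload()
]
```

Here `part_names` lists the salt field ahead of the file field, mirroring the order of the inputs in the form.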
Answer 0 (score: 2)
My solution didn't come easy, but here it is:
import base64

# Assumed imports (not shown in the original answer). These paths match the
# Django releases that still ship force_text and unescape_entities.
from django.core.files.uploadhandler import (
    FileUploadHandler, SkipFile, StopUpload)
from django.http.multipartparser import (
    FIELD, FILE, ChunkIter, LazyStream, MultiPartParserError, Parser, exhaust)
from django.utils.datastructures import MultiValueDict
from django.utils.encoding import force_text
from django.utils.text import unescape_entities


class IntelligentUploadHandler(FileUploadHandler):
    """
    An upload handler which overrides the default multipart parser to allow
    simultaneous parsing of fields and files... intelligently. Subclass this
    for real and true awesomeness.
    """

    def __init__(self, *args, **kwargs):
        super(IntelligentUploadHandler, self).__init__(*args, **kwargs)

    def field_parsed(self, field_name, field_value):
        """
        A callback method triggered when a non-file field has been parsed
        successfully by the parser. Use this to listen for new fields being
        parsed.
        """
        pass

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        """
        Parse the raw input from the HTTP request and split items into fields
        and files, executing callback methods as necessary.

        Shamelessly adapted and borrowed from
        django.http.multipartparser.MultiPartParser.
        """
        # following suit from the source class, this is imported here to avoid
        # a potential circular import
        from django.http import QueryDict

        # create return values
        self.POST = QueryDict('', mutable=True)
        self.FILES = MultiValueDict()

        # initialize the parser and stream
        stream = LazyStream(ChunkIter(input_data, self.chunk_size))

        # whether or not to signal a file-completion at the beginning of the loop.
        old_field_name = None
        counter = 0

        try:
            for item_type, meta_data, field_stream in Parser(stream, boundary):
                if old_field_name:
                    # we run this test at the beginning of the next loop since
                    # we cannot be sure a file is complete until we hit the next
                    # boundary/part of the multipart content.
                    file_obj = self.file_complete(counter)

                    if file_obj:
                        # if we return a file object, add it to the files dict
                        self.FILES.appendlist(
                            force_text(old_field_name, encoding,
                                       errors='replace'),
                            file_obj)

                    # wipe it out to prevent havoc
                    old_field_name = None

                try:
                    disposition = meta_data['content-disposition'][1]
                    field_name = disposition['name'].strip()
                except (KeyError, IndexError, AttributeError):
                    continue

                transfer_encoding = meta_data.get('content-transfer-encoding')

                if transfer_encoding is not None:
                    transfer_encoding = transfer_encoding[0].strip()

                field_name = force_text(field_name, encoding, errors='replace')

                if item_type == FIELD:
                    # this is a POST field
                    if transfer_encoding == "base64":
                        raw_data = field_stream.read()
                        try:
                            # base64.b64decode replaces the Python 2-only
                            # str(raw_data).decode('base64') idiom
                            data = base64.b64decode(raw_data)
                        except Exception:
                            data = raw_data
                    else:
                        data = field_stream.read()

                    self.POST.appendlist(field_name, force_text(data, encoding,
                                                                errors='replace'))

                    # trigger listener
                    self.field_parsed(field_name, self.POST.get(field_name))
                elif item_type == FILE:
                    # this is a file
                    file_name = disposition.get('filename')

                    if not file_name:
                        continue

                    # transform the file name
                    file_name = force_text(file_name, encoding, errors='replace')
                    file_name = self.IE_sanitize(unescape_entities(file_name))

                    content_type = meta_data.get('content-type', ('',))[0].strip()

                    try:
                        charset = meta_data.get('content-type', (0, {}))[1].get('charset', None)
                    except Exception:
                        charset = None

                    try:
                        file_content_length = int(meta_data.get('content-length')[0])
                    except (IndexError, TypeError, ValueError):
                        file_content_length = None

                    counter = 0

                    # now, do the important file stuff
                    try:
                        # alert on the new file
                        self.new_file(field_name, file_name, content_type,
                                      file_content_length, charset)

                        # chubber-chunk it
                        for chunk in field_stream:
                            if transfer_encoding == "base64":
                                # base 64 decode it if need be
                                over_bytes = len(chunk) % 4

                                if over_bytes:
                                    over_chunk = field_stream.read(4 - over_bytes)
                                    chunk += over_chunk

                                try:
                                    chunk = base64.b64decode(chunk)
                                except Exception as e:
                                    # since this is only a chunk, any error is an unfixable error
                                    raise MultiPartParserError("Could not decode base64 data: %r" % e)

                            chunk_length = len(chunk)
                            self.receive_data_chunk(chunk, counter)
                            counter += chunk_length
                        # ... and we're done
                    except SkipFile:
                        # just eat the rest
                        exhaust(field_stream)
                    else:
                        # handle file upload completions on next iteration
                        old_field_name = field_name
        except StopUpload as e:
            # if we get a request to stop the upload, exhaust it if no con reset
            if not e.connection_reset:
                exhaust(input_data)
        else:
            # make sure that the request data is all fed
            exhaust(input_data)

        # signal the upload has been completed
        self.upload_complete()

        return self.POST, self.FILES

    def IE_sanitize(self, filename):
        """Cleanup filename from Internet Explorer full paths."""
        return filename and filename[filename.rfind("\\")+1:].strip()
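Basically, by subclassing this class, you can have a more... intelligent upload handler. Fields are announced to subclasses through the field_parsed method, which is what I needed for my purposes.
I have reported this to the Django team as a feature request, in the hope that this functionality becomes part of the regular toolbox in Django rather than requiring the source code to be monkey-patched as above.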
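As a rough idea of how a subclass might use the callback, the sketch below keeps the hashing logic in a small Django-free helper that the handler methods could delegate to. Everything here is hypothetical: `SaltedDigest` and its method names are made up for illustration, and prepending the salt to the file bytes is only an assumption about how the salt is meant to be combined with the file. It also assumes the salt input comes before the file input in the form, as it does in the question, so that the salt is known when the file starts.

```python
from hashlib import sha256


class SaltedDigest(object):
    """Hypothetical helper: SHA-256 over the salt followed by the file bytes."""

    def __init__(self):
        self.salt = b""
        self._digester = None

    def set_salt(self, value):
        # Parsed field values arrive as text, so encode before hashing.
        self.salt = value.encode("utf-8")

    def start(self):
        # Begin a fresh digest seeded with whatever salt has been seen so far.
        self._digester = sha256(self.salt)

    def update(self, chunk):
        # Feed one chunk of file data into the running digest.
        self._digester.update(chunk)

    def hexdigest(self):
        return self._digester.hexdigest()


# Simulate the order of callbacks for one salt field and one two-chunk file.
digest = SaltedDigest()
digest.set_salt(u"pepper")   # would be called from field_parsed
digest.start()               # would be called from new_file
digest.update(b"hello ")     # would be called from receive_data_chunk
digest.update(b"world")
result = digest.hexdigest()  # would be read in file_complete
```

In a subclass of IntelligentUploadHandler, field_parsed would call `set_salt` when the field name is "salt", new_file would call `start`, receive_data_chunk would call `update`, and file_complete would wrap `hexdigest()` in the uploaded-file object.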
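Answer 1 (score: 0)
Based on the code for FileUploadHandler, found at line 62 of:
https://github.com/django/django/blob/master/django/core/files/uploadhandler.py
it looks like the request object is passed to the handler and stored as self.request.
In that case, you should be able to access the salt anywhere in your upload handler by doing:
    salt = self.request.POST.get('salt')
Unless I'm misunderstanding your question.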