Question

I have a flow where the web server ingests a file (via upload), saves it to S3 using default_storage, and then creates a Celery task to process that file on the backend.

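import csv

from celery import shared_task
from django.core.files.storage import default_storage
from django.http import HttpResponse

def upload_file(request):
  path = 'uploads/my_file.csv'
  with default_storage.open(path, 'w') as file:
    # Decode the upload (dropping any BOM) and write it as text.
    file.write(request.FILES['upload'].read().decode('utf-8-sig'))
  process_upload.delay(path)
  return HttpResponse()

@shared_task
def process_upload(path):
  with default_storage.open(path, 'r') as file:
    # Sniff the dialect from the first 1024 characters, then rewind.
    dialect = csv.Sniffer().sniff(file.read(1024))
    file.seek(0)
    reader = csv.DictReader(file, dialect=dialect)
    for row in reader:
      pass  # etc...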

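The problem is that although I explicitly use text mode for both writing and reading, the file comes back as bytes when I read it, which the csv library cannot handle. Is there any way around this without reading and decoding the entire file in memory?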

Answer 1

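It looks like you need to add b (binary mode) to the open call:

From the docs:

'b' appended to the mode opens the file in binary mode: now the data is read and written in the form of bytes objects. This mode should be used for all files that don't contain text.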

@shared_task
def process_upload(path):
  with default_storage.open(path, 'rb') as file:
    # Rest of your code goes here.
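
Once the file is opened with 'rb', one possible way to feed it to the csv module without decoding everything up front is to wrap the handle in io.TextIOWrapper, which decodes in chunks as the data is read. The following is only a sketch: iter_csv_rows is a hypothetical helper name, io.BytesIO stands in for the handle returned by default_storage.open(path, 'rb'), and it assumes that handle behaves like a standard seekable binary file object (not verified here for every storage backend).

```python
import csv
import io


def iter_csv_rows(binary_file, encoding="utf-8-sig"):
    """Yield CSV rows as dicts from a file object opened in binary mode.

    Hypothetical helper; assumes binary_file is a seekable binary stream.
    """
    # TextIOWrapper decodes bytes in chunks as they are read.
    # newline="" lets the csv module handle line endings itself.
    text_file = io.TextIOWrapper(binary_file, encoding=encoding, newline="")

    # Sniff the dialect from the first 1024 characters, then rewind.
    dialect = csv.Sniffer().sniff(text_file.read(1024))
    text_file.seek(0)

    yield from csv.DictReader(text_file, dialect=dialect)


# Stand-in for the handle returned by default_storage.open(path, 'rb').
fake_upload = io.BytesIO(b"id,name\n1,alice\n2,bob\n")
for row in iter_csv_rows(fake_upload):
    print(row)
```

Inside the task this would be called as iter_csv_rows(file) within the with block. Note that a TextIOWrapper closes the underlying file when it is garbage-collected, so call detach() on it first if the handle has to stay open afterwards.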

Celery, Django and S3 default storage cause file reading issues
