Django Pandas to HTTP response (download file)

Time: 2016-02-08 10:38:26

Tags: python django pandas

Python: 2.7.11

Django: 1.9

Pandas: 0.17.1

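How should I create a download for a potentially large xlsx file? I am creating an xlsx file with pandas from a list of dictionaries and now need to let the user download it. The list is held in a variable, and saving the file locally (on the server) is not allowed.

Example: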

df = pandas.DataFrame(self.csvdict)
writer = pandas.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()

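This example just creates the file and saves it where the executing script is located. What I need is to create it as an HTTP response so that the user gets a download prompt.

I found a few posts about doing this for xlsxwriter, but none for pandas. I also think I should use 'StreamingHttpResponse' rather than 'HttpResponse'.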

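6 answers:

Answer 0 (score: 9)

I will elaborate on what @jmcnamara wrote. This works with the latest versions of Excel, Pandas and Django. The import statements go at the top of views.py, and the rest of the code can go in the view: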

import pandas as pd
from django.http import HttpResponse
# xlsx data is binary, so use a bytes buffer; io.BytesIO exists on Python 2.6+ and 3
from io import BytesIO as IO

# this is my output data, a list of lists
output = some_function()
df_output = pd.DataFrame(output)

# my "Excel" file, which is an in-memory output file (buffer) 
# for the new workbook
excel_file = IO()

xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')

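df_output.to_excel(xlwriter, sheet_name='sheetname')

# close() writes the workbook into the buffer; a separate save() call is
# redundant (and ExcelWriter.save() was removed in pandas 2.0)
xlwriter.close()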

# important step, rewind the buffer or when it is read() you'll get nothing
# but an error message when you try to open your zero length file in Excel
excel_file.seek(0)

# set the mime type so that the browser knows what to do with the file
response = HttpResponse(excel_file.read(), content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')

# set the file name in the Content-Disposition header
response['Content-Disposition'] = 'attachment; filename=myfile.xlsx'

return response
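
The rewind step above is easy to overlook. As a minimal sketch using only the standard library (no pandas or Django; the payload bytes are a made-up stand-in for a workbook), this shows what `read()` returns before and after `seek(0)`, and that `getvalue()` ignores the current position:

```python
from io import BytesIO

buf = BytesIO()
buf.write(b"PK\x03\x04 pretend workbook bytes")  # hypothetical stand-in payload

# After writing, the position sits at the end of the buffer,
# so read() has nothing left to return.
at_end = buf.read()

# Rewind to the start; read() now returns everything that was written.
buf.seek(0)
after_rewind = buf.read()

# getvalue() returns the whole buffer regardless of the current position.
whole = buf.getvalue()

print(len(at_end), len(after_rewind), len(whole))
```

So passing `excel_file.getvalue()` to the response would work without the `seek(0)` call, whereas `excel_file.read()` needs it.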

Answer 1 (score: 8)

Jmcnamara is pointing you in the right direction. Translated to your question, you are looking for the following code:

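from io import BytesIO

import pandas
from django.http import StreamingHttpResponse

sio = BytesIO()  # xlsx data is binary, so use a bytes buffer
PandasDataFrame = pandas.DataFrame(self.csvdict)
PandasWriter = pandas.ExcelWriter(sio, engine='xlsxwriter')
PandasDataFrame.to_excel(PandasWriter, sheet_name=sheetname)
PandasWriter.close()  # writes the workbook into the buffer

sio.seek(0)  # rewind so the response reads from the start

# pass the buffer itself: iterating over a bytes string would yield
# single bytes (integers on Python 3) instead of usable chunks
response = StreamingHttpResponse(sio, content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s' % filename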

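Note that you save the data to an in-memory buffer variable rather than to a file location. This way the file is never saved to disk before the response is generated.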
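
If streaming is the goal, one option is to hand `StreamingHttpResponse` an iterator of byte chunks rather than a single bytes string. A minimal sketch, standard library only; `iter_chunks` and the chunk size are hypothetical names and values chosen for illustration, not part of Django or pandas:

```python
from io import BytesIO


def iter_chunks(buffer, chunk_size=8192):
    """Yield successive chunks of bytes from a file-like object."""
    buffer.seek(0)  # start from the beginning of the buffer
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:  # empty bytes means the buffer is exhausted
            break
        yield chunk


# Stand-in for a buffer that an ExcelWriter has already written to.
sio = BytesIO(b"x" * 20000)
chunks = list(iter_chunks(sio))
print([len(c) for c in chunks])
```

In a view, `StreamingHttpResponse(iter_chunks(sio), content_type=...)` would then send the chunks one at a time. The workbook is still built fully in memory first; streaming only changes how it is sent.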

Answer 2 (score: 2)

Just wanted to share a class-based view approach to this, using elements from the answers above. Just override the get method of a Django View. My model has a JSON field which contains the results of dumping a dataframe to JSON with the to_json method.

Python version is 3.6 with Django 1.11.

# models.py
from django.db import models
from django.contrib.postgres.fields import JSONField

class myModel(models.Model):
    json_field = JSONField(verbose_name="JSON data")

# views.py
import pandas as pd
from io import BytesIO as IO
from django.http import HttpResponse
from django.views import View

from .models import myModel

class ExcelFileDownloadView(View):
    """
    Allows the user to download records in an Excel file
    """

    def get(self, request, *args, **kwargs):

        obj = myModel.objects.get(pk=self.kwargs['pk'])
        excel_file = IO()
        xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
        pd.read_json(obj.json_field).to_excel(xlwriter, "Summary")
        # close() writes the workbook to the buffer; a separate save() is redundant
        xlwriter.close()

        excel_file.seek(0)

        response = HttpResponse(excel_file.read(),
                                content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')

        response['Content-Disposition'] = 'attachment; filename="excel_file.xlsx"'
        return response

# urls.py
from django.conf.urls import url
from .views import ExcelFileDownloadView

urlpatterns = [
    url(r'^mymodel/(?P<pk>\d+)/download/$', ExcelFileDownloadView.as_view(), name="excel-download"),
]

Answer 3 (score: 1)

With Pandas 0.17+ you can use a StringIO/BytesIO object as the file handle for pd.ExcelWriter. For example:

import pandas as pd
from io import BytesIO

# xlsx data is binary; io.BytesIO works on both Python 2.6+ and 3
output = BytesIO()

# Use the BytesIO object as the filehandle.
writer = pd.ExcelWriter(output, engine='xlsxwriter')

# Write the data frame to the BytesIO object.
pd.DataFrame().to_excel(writer, sheet_name='Sheet1')
writer.close()
xlsx_data = output.getvalue()

print(len(xlsx_data))

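After that, follow the XlsxWriter Python 2/3 HTTP examples.

For older versions of Pandas, you can use this workaround.

Answer 4 (score: 1)

Maybe slightly off topic, but it is worth pointing out that the to_csv method is generally faster than to_excel, because Excel files contain formatting information for the worksheet. If you only have data and no formatting information, consider using to_csv. Microsoft Excel can view and edit csv files without any problem.

A benefit of to_csv is that it accepts any file-like object as its first argument, not just a filename string. Since a Django response object is file-like, to_csv can write to it directly. Some code in a view function would look like this: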

df = <your dataframe to be downloaded>
response = HttpResponse(content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename=<default filename you wanted to give to the downloaded file>'
df.to_csv(response, index=False)
return response
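
The same file-like idea works without pandas. As a sketch, the standard library's `csv.DictWriter` can write a list of dictionaries (the shape of data in the question) to any object with a `write()` method; here `io.StringIO` stands in for the Django response, and the sample rows are made up:

```python
import csv
from io import StringIO

# Hypothetical sample data: a list of dictionaries, as in the question.
rows = [
    {"name": "alice", "score": 90},
    {"name": "bob", "score": 85},
]

# StringIO stands in for HttpResponse(content_type='text/csv');
# both expose write(), which is all csv.DictWriter needs.
target = StringIO()

writer = csv.DictWriter(target, fieldnames=["name", "score"])
writer.writeheader()    # first line: column names
writer.writerows(rows)  # one line per dictionary

csv_text = target.getvalue()
print(csv_text)
```

Swapping `target` for a real response object should be the only change needed inside a view, assuming the response is created with a CSV content type as in the snippet above.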

References:

  1. https://gist.github.com/jonperron/733c3ead188f72f0a8a6f39e3d89295d
  2. https://docs.djangoproject.com/en/2.1/howto/outputting-csv/

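Answer 5 (score: 0)

You are mixing two requirements that should be kept separate:

  1. Creating the .xlsx file with python or pandas - it looks like you are doing fine on this part.

  2. Serving a downloadable file (django); see this post, or maybe this one.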