Getting a DataFrame into a Word document as a table

Time: 2020-09-16 04:52:46

Tags: python pandas numpy docx

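I am reading an Excel file, extracting a specific df, and putting it into a Word document. The problem I am facing is:

  1. Once added to the paragraph, the DF loses its shape and becomes completely useless.

The full code is below.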

#importing required libraries
import pandas as pd
import numpy as np
eod = pd.read_excel('df.xlsx')
import datetime
import docx 
from datetime import date
legal = docx.Document('legal.docx')

#Calculating No. days from SCN
eod['SCN Days'] = (pd.Timestamp('now').floor('d') - eod['SCN Date']).dt.days

#Generation list of EFE for Final Showcause Notice to be issued today
FSCN_today = eod.where(eod['SCN Days']>20)
#Dropping Null from generated list
FSCN_today = FSCN_today.dropna(how ="all")
FSCN_today = FSCN_today[['Exporter Name','EFE','DESTINATION','VALUE']]

#Getting Unique Values in the list generated
s_values = FSCN_today['Exporter Name'].unique()

#Iterating through List
for c in s_values:
    df1 = FSCN_today[FSCN_today['Exporter Name'] == c]
    legal.paragraphs[7].text = c
    legal.paragraphs[8].text = df1.iloc[10:1]
    legal.paragraphs[15].text = str(df1)
    notice_name = str(c)+ ".docx"
    legal.save(notice_name)

#Update Date & Status of FSCN Issued today
eod['FSCN Date'] = np.where((eod['Status']=="SCN ISSUED") & (eod['SCN Days']>20),date.today(),eod['FSCN Date'])
eod['Status'] = np.where((eod['Status']=="SCN ISSUED") & (eod['SCN Days']>20),"FSCN ISSUED",eod['Status'])

#In progress
name = "EOD "+ str(date.today())+ ".xlsx"
#eod.to_excel(name,index =False)  

The error is in the following line.

legal.paragraphs[15].text = str(df1)
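
For context, here is a minimal sketch of what str(df1) actually produces. The sample rows are hypothetical; only the column names come from the code above. The result is plain text whose columns are lined up with padding spaces, so it only looks like a table in a monospaced font. In a Word paragraph set in a proportional font, that alignment is lost.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df1 = pd.DataFrame({
    "Exporter Name": ["Acme", "Acme"],
    "EFE": ["E-1", "E-22"],
    "DESTINATION": ["Dubai", "Rotterdam"],
    "VALUE": [1000, 25],
})

# str() renders the frame as plain text: one header line plus one line per row.
text = str(df1)
lines = text.splitlines()

# The columns are separated by runs of spaces, not by real table cells.
for line in lines:
    print(repr(line))
```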

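3 Answers:

Answer 0 (score: 0)

I noticed that legal.paragraphs[8].text = df1.iloc[10:1] looks odd.

If I change it to legal.paragraphs[8].text = df1[10:1].iloc, the generated .docx file looks more reasonable to me.

I don't know what your desired output is, so this is my best guess.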
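
To illustrate why that slice looks odd, here is a small sketch with hypothetical sample data. A positional slice whose start (10) is larger than its stop (1) selects no rows, so df1.iloc[10:1] is an empty DataFrame, not a piece of text. The last line shows one possible way to pull out a single cell as a string, which is only a guess at what was intended.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df1 = pd.DataFrame({
    "Exporter Name": ["Acme", "Acme"],
    "EFE": ["E-1", "E-22"],
})

# Start (10) is past stop (1), so the slice selects no rows at all.
empty = df1.iloc[10:1]
print(empty.empty)

# Assumption: if a single value was wanted, address one cell by
# (row position, column position) and convert it to a string.
first_efe = str(df1.iloc[0, 1])
print(first_efe)
```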

Answer 1 (score: 0)

I have never worked with python-docx, so I'm fairly sure my attempt is suboptimal. The following does handle the sample data.

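Basically, I added a table to the document and inserted the column labels and the contents of the DataFrame into it. There is one ugly part that I could not work around (the part where I access private, underscore-prefixed attributes of the paragraph and the table).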

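I replaced the following part of the code above with a version that builds the table (in that version, comments highlight what I changed and lines are wrapped for readability):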

#Iterating through List
for c in s_values:
    df1 = FSCN_today[FSCN_today['Exporter Name'] == c]
    legal.paragraphs[7].text = c
    legal.paragraphs[8].text = df1.iloc[10:1]
    legal.paragraphs[15].text = str(df1)
    notice_name = str(c)+ ".docx"
    legal.save(notice_name)

Answer 2 (score: 0)

You can achieve this by creating a table, transferring the DataFrame into it (as explained in this post), and placing the table where legal.paragraphs[15] is:

#importing required libraries
import pandas as pd
import numpy as np
eod = pd.read_excel('df.xlsx')
import datetime
import docx 
from datetime import date

#Calculating No. days from SCN
eod['SCN Days'] = (pd.Timestamp('now').floor('d') - eod['SCN Date']).dt.days

#Generation list of EFE for Final Showcause Notice to be issued today
FSCN_today = eod.where(eod['SCN Days']>20)
#Dropping Null from generated list
FSCN_today = FSCN_today.dropna(how ="all")
FSCN_today = FSCN_today[['Exporter Name','EFE','DESTINATION','VALUE']]

#Getting Unique Values in the list generated
s_values = FSCN_today['Exporter Name'].unique()

#Iterating through List
for c in s_values:
    legal = docx.Document('legal.docx')
    df1 = FSCN_today[FSCN_today['Exporter Name'] == c]
    legal.paragraphs[7].text = c
    legal.paragraphs[8].text = df1.iloc[10:1].iloc
    legal.paragraphs[15].text = ""
    t = legal.add_table(df1.shape[0]+1, df1.shape[1])
    for j in range(df1.shape[-1]):
        t.cell(0,j).text = df1.columns[j]
    for i in range(df1.shape[0]):
        for j in range(df1.shape[-1]):
            t.cell(i+1,j).text = str(df1.values[i,j])    
    legal.paragraphs[15]._p.addnext(t._tbl)
    notice_name = str(c)+ ".docx"
    legal.save(notice_name)

#Update Date & Status of FSCN Issued today
eod['FSCN Date'] = np.where((eod['Status']=="SCN ISSUED") & (eod['SCN Days']>20),date.today(),eod['FSCN Date'])
eod['Status'] = np.where((eod['Status']=="SCN ISSUED") & (eod['SCN Days']>20),"FSCN ISSUED",eod['Status'])

#In progress
name = "EOD "+ str(date.today())+ ".xlsx"
#eod.to_excel(name,index =False) 

(I moved legal = docx.Document('legal.docx') into the loop, because otherwise each successive docx kept the values from the earlier exporters.)