Question

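I have a data frame containing survey comments. One column holds each respondent's group number, followed by several columns that have the question text in the header row and the responses in the rows below. Not everyone answered every question, so some cells are blank.

I want to use the docx package to output the comments to a Word file. I'd like the question text to appear as a heading, with the group number as a subheading beneath it (grouping the responses by group number) and that group's responses in a bulleted list below, then move on to the next question and repeat. I also don't want to output the blank cells.

The code below gives an idea of what I'm trying to do.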

import docx
import pandas as pd
from docx import Document
import numpy as np
from docx.shared import Inches
from docx.enum.section import WD_SECTION
from docx.enum.section import WD_ORIENT

# initialize list of lists 
data = [['Group 1', 'Comment A', 'Comment B', 'Comment C'], ['Group 2', 'Comment D', '', ''], ['Group 2', 'Comment E', '', 'Comment F'], ['Group 1', '', 'Comment G', 'Comment H'], ] 

# Create the pandas DataFrame 
df = pd.DataFrame(data, columns = ['Group', 'Question 1', 'Question 2', 'Question 3']) 
print(df)

# create file
doc = Document()

sections = doc.sections
section = sections[0]

# Convert to landscape orientation
new_width, new_height = section.page_height, section.page_width
section.orientation = WD_ORIENT.LANDSCAPE
section.page_width = new_width
section.page_height = new_height

# Document Title
doc.add_heading('Document Title', level=0)

# Opening text
doc.add_paragraph('Some text...')

# Do I need to sort by 'Group' before doing the loops?

# loop through the questions - this isn't working
for column in df[2:]:
    # create a heading for each question
    doc.add_heading(column, level=1)
    for g in df.Group:
        # create a heading for each question
        doc.add_heading(g, level=3)
        for c in df[g]:
            doc.add_paragraph(c, style='List Bullet')

# save the doc
doc.save('./test.docx')

The desired output is:

Document Title

Some text...

Question 1

Group 1
 - Comment A

Group 2
 - Comment D
 - Comment E

Question 2

Group 1
 - Comment B
 - Comment G

Question 3

Group 1
 - Comment C
 - Comment H

Group 2
 - Comment F

Answer 1

This works for the loop:

# loop through the questions
for column in df.columns[1:]:
    # create a heading for each question
    doc.add_heading(column, level=3)
    ###Make a new dataframe with only Group and column of interest
    new_df = df[['Group', column]]
    ###Make list of all units
    unit_list = list(new_df['Group'].unique())
    ###Make list of comments in each unit for this column
    for unit in unit_list:
        comments = [row[2] for row in new_df.itertuples() if row[1] == unit]
        comments = [i for i in comments if len(i) > 0]
        ###If there were any comments in this unit, add the unit as a subheader
        if len(comments) > 0:
            doc.add_heading(unit, level=4)
            # Bullet list of comments
            for c in comments:
                doc.add_paragraph(c, style='List Bullet')
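
As an alternative to filtering rows by hand, the same grouping can be sketched with `groupby`. This is only a rough sketch, not the only way to do it: the helper names `group_comments` and `write_comments` are made up for illustration, and it assumes blank answers are either empty strings or NaN. Because `groupby` sorts the group keys by default, the data frame does not need to be sorted by 'Group' beforehand.

```python
import pandas as pd


def group_comments(df, group_col="Group"):
    """Return {question: {group: [non-blank comments]}} (hypothetical helper)."""
    grouped = {}
    for question in df.columns.drop(group_col):
        # Treat NaN and whitespace-only cells as blank.
        answers = df[question].fillna("").astype(str).str.strip()
        per_group = {}
        # sort=True orders the group keys, so df itself need not be sorted.
        for group, cells in answers.groupby(df[group_col], sort=True):
            comments = [c for c in cells if c]
            if comments:  # skip groups with no answers to this question
                per_group[group] = comments
        grouped[question] = per_group
    return grouped


def write_comments(doc, grouped, question_level=1, group_level=3):
    """Write the nested dict into a python-docx Document (hypothetical helper)."""
    for question, per_group in grouped.items():
        doc.add_heading(question, level=question_level)
        for group, comments in per_group.items():
            doc.add_heading(group, level=group_level)
            for comment in comments:
                doc.add_paragraph(comment, style="List Bullet")
```

With the `doc` and `df` from the question, calling `write_comments(doc, group_comments(df))` before `doc.save('./test.docx')` should produce the layout described in the question. Keeping the grouping separate from the writing makes it easy to inspect the nested dict before anything is written to the document.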
