How can I extract the labels of checked checkboxes from a Word document using the python-docx library or any other library?

Time: 2019-09-03 13:48:42

Tags: python-3.x ms-word lxml docx python-docx

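I am trying to extract data from a Word document. The document is a questionnaire. It contains tables, and the tables contain checkboxes. I want to extract the labels of the checkboxes that are checked.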

from docx import Document
import pandas as pd

# `path` and `file` are assumed to be defined earlier
wordDoc = Document(path + file)
# I am interested only in the first table
table = wordDoc.tables[0]

# The table's first column contains questions; the second column contains answers.
# I need only the answers.
cells = table.columns[1].cells
a = []
for cell in cells:
    # Look up the checkboxes once and reuse the result
    checkboxes = cell._element.xpath('.//w:checkBox')
    if checkboxes:
        for checkbox in checkboxes:
            # here should go some code
            pass  # placeholder so the loop is syntactically valid
    else:
        a.append(cell.text)
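
One possible way to fill in the inner loop above is sketched below. It is not a confirmed solution and rests on two assumptions: the checkboxes are legacy form fields (the `w:checkBox` elements the XPath already finds), and each label is the text that follows its checkbox within the same paragraph. In such fields the state is usually stored in a `w:checked` child, with `w:default` as the fallback when `w:checked` is absent. The helper names (`_is_on`, `is_checked`, `checked_labels`) and the `SAMPLE` fragment are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace in Clark notation, e.g. "{ns}checkBox"
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _is_on(el):
    """Interpret an on/off element; a missing w:val attribute counts as on."""
    if el is None:
        return False
    return el.get(W + "val", "true") not in ("0", "false", "off")


def is_checked(checkbox):
    """Use w:checked when present, otherwise fall back to w:default."""
    checked = checkbox.find(W + "checked")
    if checked is not None:
        return _is_on(checked)
    return _is_on(checkbox.find(W + "default"))


def checked_labels(element):
    """Return the labels of ticked checkboxes found under `element`.

    Assumption: a label is the text between one checkbox and the next
    (or the end of the paragraph).
    """
    labels = []
    for para in element.iter(W + "p"):
        entries = []    # one [is_checked, text_parts] pair per checkbox
        current = None  # the checkbox whose label is being collected
        for node in para.iter():  # document order
            if node.tag == W + "checkBox":
                current = [is_checked(node), []]
                entries.append(current)
            elif node.tag == W + "t" and current is not None:
                current[1].append(node.text or "")
        labels.extend("".join(parts).strip() for on, parts in entries if on)
    return labels


# Hypothetical cell content: "Yes" is ticked, "No" is not.
SAMPLE = """
<w:tc xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:p>
    <w:r><w:fldChar w:fldCharType="begin"><w:ffData><w:checkBox>
      <w:default w:val="0"/><w:checked/></w:checkBox></w:ffData></w:fldChar></w:r>
    <w:r><w:instrText> FORMCHECKBOX </w:instrText></w:r>
    <w:r><w:fldChar w:fldCharType="end"/></w:r>
    <w:r><w:t xml:space="preserve"> Yes </w:t></w:r>
    <w:r><w:fldChar w:fldCharType="begin"><w:ffData><w:checkBox>
      <w:default w:val="0"/></w:checkBox></w:ffData></w:fldChar></w:r>
    <w:r><w:instrText> FORMCHECKBOX </w:instrText></w:r>
    <w:r><w:fldChar w:fldCharType="end"/></w:r>
    <w:r><w:t xml:space="preserve"> No</w:t></w:r>
  </w:p>
</w:tc>
"""

print(checked_labels(ET.fromstring(SAMPLE)))  # ['Yes']
```

The helpers only call `iter`, `find`, `get` and `tag`, which lxml elements also provide, so passing `cell._element` from python-docx in place of the parsed sample should work the same way. This has not been checked against the questionnaire in question. If the labels sit before the checkboxes or in a neighbouring cell, the text-collection step would need to change.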

0 answers:

No answers