Reading dropdown values from Excel with pandas

Time: 2018-06-18 20:59:27

Tags: python excel python-3.x pandas xlsx

I have an Excel file with a dropdown cell. I have been trying to read the dropdown list from it, but it only reads the selected option.

import pandas

df = pandas.read_excel("BQA.xlsx", header=0)
df.columns = df.columns.str.strip()

print(df)

Output:

 Empty DataFrame
 Columns: [Column 1, Column 2, Column 3, Column 4, yes]
 Index: []

Expected output:

Empty DataFrame
Columns: [Column 1, Column 2, Column 3, Column 4, [yes, no, yes1, no1]]
Index: []

1 answer:

Answer 0 (score: 1)

You can use openpyxl to extract the dropdown information: it is stored in the data_validations of a given worksheet. For example (line breaks inserted for readability):

>>> import openpyxl
>>> wb = openpyxl.load_workbook("dropdown.xlsx")
>>> ws = wb["Sheet1"]
>>> ws.data_validations
<openpyxl.worksheet.datavalidation.DataValidationList object>
Parameters:
disablePrompts=None, xWindow=None, yWindow=None, count=1, 
dataValidation=[<openpyxl.worksheet.datavalidation.DataValidation object>
Parameters:
sqref=<MultiCellRange [E1]>, showErrorMessage=True, showDropDown=None, showInputMessage=True, 
allowBlank=False, errorTitle=None, error=None, promptTitle=None, prompt=None,
type='list', errorStyle=None, imeMode=None, operator=None, formula1='$L$4:$L$7', formula2=None]

I'm not going to handle all the possible cases, so this is only an example of the sort of thing you could do, but something like the following reads a range and replaces the value of each dropdown cell with its list of options:

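import openpyxl

def read_with_dropdown(book_name, sheet_name, range_str):
    wb = openpyxl.load_workbook(book_name)
    ws = wb[sheet_name]
    # Plain cell values for the requested range, as a list of rows.
    data = [[cell.value for cell in row] for row in ws[range_str]]

    validations = ws.data_validations.dataValidation
    for validation in validations:
        ranges = validation.sqref.ranges
        # Only validations applied to a single range are handled here.
        if len(ranges) != 1:
            raise NotImplementedError
        if validation.type == 'list':
            # formula1 is assumed to be a cell range such as '$L$4:$L$7'.
            list_cells = ws[validation.formula1]
            values = [cell.value for cell_row in list_cells for cell in cell_row]
        else:
            raise NotImplementedError
        # bounds is (min_col, min_row, max_col, max_row), all 1-based.
        bounds = ranges[0].bounds
        try:
            # This indexing assumes range_str starts at A1.
            data[bounds[1]-1][bounds[0]-1] = values
        except IndexError:
            # The validated cell lies outside the range that was read.
            pass
    return data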
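
One of the cases the function above does not handle is a list validation whose formula1 is not a plain cell range. As far as I know, formula1 can also hold an inline list in double quotes (such as '"yes,no,yes1,no1"'), a sheet-qualified range, or a named range. The following is only a rough sketch of how those forms could be told apart before indexing the worksheet. parse_list_formula is a hypothetical helper written for this example, not part of openpyxl, and it uses only the standard library:

```python
import re

# Hypothetical helper (not an openpyxl API): classify the formula1 string
# of a list-type data validation.
_RANGE_RE = re.compile(
    r"^(?:(?P<sheet>'[^']+'|[^'!]+)!)?"               # optional sheet name
    r"\$?(?P<c1>[A-Za-z]{1,3})\$?(?P<r1>\d+)"         # first cell, e.g. $L$4
    r"(?::\$?(?P<c2>[A-Za-z]{1,3})\$?(?P<r2>\d+))?$"  # optional second cell
)

def parse_list_formula(formula1):
    """Return a (kind, sheet, payload) tuple.

    kind is "literal" (payload: list of option strings),
    "range" (payload: range string without '$' signs), or
    "unknown" (payload: the raw text, e.g. a named range).
    """
    text = formula1.strip()
    if text.startswith("="):
        text = text[1:]
    # Inline list: options separated by commas inside double quotes.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        options = [item.strip() for item in text[1:-1].split(",")]
        return ("literal", None, options)
    match = _RANGE_RE.match(text)
    if match:
        sheet = match.group("sheet")
        if sheet is not None:
            sheet = sheet.strip("'")
        cells = match.group("c1").upper() + match.group("r1")
        if match.group("c2"):
            cells += ":" + match.group("c2").upper() + match.group("r2")
        return ("range", sheet, cells)
    # Anything else (for example a named range) is left to the caller.
    return ("unknown", None, text)
```

With a "range" result, the payload could be passed to ws[...] (or to wb[sheet][...] when a sheet name is present) in place of validation.formula1. With a "literal" result, the options are already available and no worksheet lookup is needed.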