Question

df1 = pd.read_excel(mxln)  # Loads master xlsx for comparison
df2 = pd.read_excel(sfcn)  # Loads student xlsx for comparison
difference = df2[df2 != df1]  # Scans for differences

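If there are differences, I want to store those cell locations in a list. They need to be in 'A1' format (not something like [1,1]) so that I can pass them to this: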

redFill = PatternFill(start_color='FFEE1111', end_color='FFEE1111', fill_type='solid')
lsws['A1'].fill = redFill
lsfh.save(sfcn)

I have looked at solutions like this, but I can't get them to work and don't understand them. For example, the following does not work:

def highlight_cells():
    df1 = pd.read_excel(mxln)  # Loads master xlsx for comparison
    df2 = pd.read_excel(sfcn)  # Loads student xlsx for comparison
    difference = df2[df2 != df1]  # Scans for differences
    return ['background-color: yellow']

df2.style.apply(highlight_cells)

Answer 1

To get the cells that differ between two pandas.DataFrame objects as Excel coordinates, you can do the following:

Code:

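import numpy as np
from openpyxl.utils import get_column_letter as column_letter

def diff_cell_indices(dataframe1, dataframe2):
    # Index levels occupy the leftmost sheet columns; header levels the top rows.
    x_ofs = dataframe1.index.nlevels + 1    # +1: Excel columns are 1-based
    y_ofs = dataframe1.columns.nlevels + 1  # +1: Excel rows are 1-based
    # np.where yields (row, column) positions of every differing cell
    return [column_letter(x + x_ofs) + str(y + y_ofs) for
            y, x in zip(*np.where(dataframe1 != dataframe2))]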
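One caveat with a plain `!=` comparison: NaN never compares equal to NaN, so cells that are empty in both workbooks would be reported as differences. A minimal sketch of a NaN-aware variant follows. The name `diff_cell_indices_nan_safe` is hypothetical, and the offsets assume the sheet has one header row and that the DataFrame index was read from the leftmost sheet column(s).

```python
import numpy as np
import pandas as pd
from openpyxl.utils import get_column_letter


def diff_cell_indices_nan_safe(df1, df2):
    """Return 'A1'-style coordinates of cells that differ, ignoring NaN/NaN pairs.

    Hypothetical helper: assumes both frames have the same shape and labels.
    """
    both_nan = df1.isna() & df2.isna()      # empty in both sheets
    differs = (df1 != df2) & ~both_nan      # real differences only
    x_ofs = df1.index.nlevels + 1           # index columns + 1-based columns
    y_ofs = df1.columns.nlevels + 1         # header rows + 1-based rows
    rows, cols = np.where(differs.to_numpy())
    return [f"{get_column_letter(x + x_ofs)}{y + y_ofs}"
            for y, x in zip(rows, cols)]


# In-memory frames standing in for the two workbooks (illustrative data)
master = pd.DataFrame({"B": [2, 4], "C": [3.0, np.nan]}, index=["R2", "R3"])
student = master.copy()
student.loc["R2", "C"] = 1
cells = diff_cell_indices_nan_safe(master, student)
print(cells)
```

If the frames were loaded without an index column (the default `RangeIndex`), the index is not present in the sheet, so the column offset would need to be 1 instead.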

Test Code:

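import pandas as pd

# index_col=0 makes column A the index (R2, R3), matching the output below
df1 = pd.read_excel('test.xlsx', index_col=0)
print(df1)

df2 = df1.copy()
df2.loc['R2', 'C'] = 1  # change one cell so the frames differ
print(df2)

print(diff_cell_indices(df1, df2))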

Results:

    B  C
R2  2  3
R3  4  5

    B  C
R2  2  1
R3  4  5

['C2']
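The resulting list can then be fed to the fill code from the question. The following is only a sketch under stated assumptions: the helper name `highlight_cells`, the temporary file, and the sample sheet contents are made up for illustration, and the first (active) worksheet is assumed to be the one that was compared.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook
from openpyxl.styles import PatternFill


def highlight_cells(path, cells, color="FFEE1111"):
    """Fill each 'A1'-style coordinate in `cells` with a solid color and save."""
    red_fill = PatternFill(start_color=color, end_color=color,
                           fill_type="solid")
    workbook = load_workbook(path)
    sheet = workbook.active          # assumption: the compared sheet is active
    for ref in cells:
        sheet[ref].fill = red_fill
    workbook.save(path)


# Build a small stand-in for the student workbook (illustrative data only)
path = os.path.join(tempfile.mkdtemp(), "student.xlsx")
demo = Workbook()
demo_sheet = demo.active
demo_sheet.append([None, "B", "C"])
demo_sheet.append(["R2", 2, 1])
demo_sheet.append(["R3", 4, 5])
demo.save(path)

# Highlight the coordinates reported by the diff
highlight_cells(path, ["C2"])
```

Opening and saving the workbook once for the whole list avoids rewriting the file for every highlighted cell.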

How do I get cell locations from a pandas diff?
