Pandas: Parsing an Excel spreadsheet with merged cells and blank values

Time: 2021-03-22 13:24:16

Tags: python excel pandas

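My question is similar to this one. I have a spreadsheet that contains some merged cells, but the column with the merged cells also contains genuinely empty cells, for example: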

Day     Sample  CD4     CD8
----------------------------
Day 1   8311    17.3    6.44
        --------------------
        8312    13.6    3.50
        --------------------
        8321    19.8    5.88
        --------------------
        8322    13.5    4.09
----------------------------
Day 2   8311    16.0    4.92
        --------------------
        8312    5.67    2.28
        --------------------
        8321    13.0    4.34
        --------------------
        8322    10.6    1.95
----------------------------
        8323    16.0    4.92
----------------------------
        8324    5.67    2.28
----------------------------
        8325    13.0    4.34

How can I parse this into a Pandas DataFrame? I know that the fillna(method='ffill') approach will not solve my problem, because it would also replace the values that are actually missing with something else. I would like to obtain a DataFrame like this:

Day     Sample  CD4     CD8
Day 1   8311    17.3    6.44
Day 1   8312    13.6    3.50
Day 1   8321    19.8    5.88
Day 1   8322    13.5    4.09
Day 2   8311    16.0    4.92
Day 2   8312    5.67    2.28
Day 2   8321    13.0    4.34
Day 2   8322    10.6    1.95
NaN     8323    16.0    4.92
NaN     8324    5.67    2.28
NaN     8325    13.0    4.34
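To make the problem with a plain forward fill concrete, here is a minimal sketch. It assumes the Day column arrives the way pd.read_excel typically returns merged cells: only the top-left cell of each merged range holds a value and every other row is NaN. The frame is built by hand from the sample values in the question rather than read from a real file.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for what read_excel would return for the sheet above:
# merged ranges and genuinely empty cells both show up as NaN.
df = pd.DataFrame({
    "Day": ["Day 1", np.nan, np.nan, np.nan,
            "Day 2", np.nan, np.nan, np.nan,
            np.nan, np.nan, np.nan],
    "Sample": [8311, 8312, 8321, 8322,
               8311, 8312, 8321, 8322,
               8323, 8324, 8325],
})

# A plain forward fill cannot tell the two kinds of NaN apart.
filled = df["Day"].ffill()

# Samples 8323 to 8325 had empty Day cells, yet they are now labelled "Day 2".
print(pd.DataFrame({"Sample": df["Sample"], "Day (ffill)": filled}))
```

The information about which NaN values came from a merged range is not present in the DataFrame at all, so it has to be taken from the workbook itself.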

1 answer:

Answer 0: (score: 1)

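Assuming you know the row on which the data starts in the Excel file (or can come up with a better way to detect it), something like this should work: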

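import numpy as np
import openpyxl
import pandas as pd


def checkMerged(x, sheet):
    # Return True if the column-A cell of this row lies inside a merged range.
    cell = sheet.cell(int(x["Row"]), 1)
    for mergedcell in sheet.merged_cells.ranges:
        if cell.coordinate in mergedcell:
            return True
    return False


def test():
    filepath = "C:\\Users\\me\\Desktop\\SO nonsense\\PandasMergeCellTest.xlsx"
    df = pd.read_excel(filepath)
    wb = openpyxl.load_workbook(filepath)
    sheet = wb["Sheet1"]
    # My headers were in row 1, so add 2 to get the worksheet row numbers.
    df["Row"] = np.arange(len(df)) + 2
    df["Merged"] = df.apply(lambda x: checkMerged(x, sheet), axis=1)
    # Forward-fill only the rows inside a merged range; every other row keeps
    # its own value, so cells that are truly blank stay NaN.
    df["Day"] = np.where(df["Merged"], df["Day"].ffill(), df["Day"])
    # Drop the helper columns (newer pandas versions no longer accept the
    # axis as a positional argument).
    df = df.drop(columns=["Row", "Merged"])
    print(df)


test()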
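As a variation on the same idea (not the answerer's code), the merged ranges can be walked once and their top-left value written straight into the matching DataFrame rows, which avoids checking every row against every range. The sketch below is self-contained: it builds a small demo workbook in a temporary directory so it can run anywhere. The helper name read_with_merged_filled, the file name merged_demo.xlsx and the shortened sample data are all hypothetical. It also assumes that the table starts in column A, that there is a single header row, and that there are no completely blank rows between data rows.

```python
import os
import tempfile

import openpyxl
import pandas as pd


def read_with_merged_filled(path, sheet_name, column, header_row=1):
    """Read a sheet and fill `column` only inside merged ranges (hypothetical helper)."""
    df = pd.read_excel(path, sheet_name=sheet_name, header=header_row - 1)
    sheet = openpyxl.load_workbook(path)[sheet_name]
    col_idx = df.columns.get_loc(column) + 1  # 1-based worksheet column
    for rng in sheet.merged_cells.ranges:
        # Skip ranges outside the target column or overlapping the header.
        if not (rng.min_col <= col_idx <= rng.max_col) or rng.min_row <= header_row:
            continue
        # Only the top-left cell of a merged range carries the value.
        value = sheet.cell(rng.min_row, rng.min_col).value
        first = rng.min_row - header_row - 1  # worksheet row -> DataFrame label
        last = rng.max_row - header_row - 1
        df.loc[first:last, column] = value    # .loc slices include both ends
    return df


# Build a tiny demo workbook: two merged Day ranges plus one truly blank cell.
wb = openpyxl.Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["Day", "Sample"])
for day, sample in [("Day 1", 8311), (None, 8312),
                    ("Day 2", 8311), (None, 8312),
                    (None, 8323)]:
    ws.append([day, sample])
ws.merge_cells("A2:A3")
ws.merge_cells("A4:A5")

path = os.path.join(tempfile.mkdtemp(), "merged_demo.xlsx")
wb.save(path)

result = read_with_merged_filled(path, "Sheet1", "Day")
print(result)
```

In this demo the rows covered by A2:A3 and A4:A5 receive "Day 1" and "Day 2", while the last row, whose Day cell is empty and not merged, is left as NaN.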