Question

I want to read an Excel file that has a MultiIndex on both its columns and its rows. In the Excel file these appear as merged cells.

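I know pandas provides a way to read this kind of sheet: passing parameters such as 'header=[0,1,2], index_col=[0,1,2,3]' makes it convert those header rows and index columns automatically and use them as a MultiIndex on the DataFrame.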

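But when I do that, I get a lot of 'Unnamed: #' entries in my columns. So I first read the sheet without any header or index, then use the fillna method to fill in those cells, and finally assign the processed rows and columns to df.columns / df.index.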

I set df.columns successfully and got the DataFrame shown at the link below. (Sorry, I don't have enough reputation to post images on this page.)

But the problem appears when I try to set the columns ['Month', 'Week', 'Name', 'Task'] as the index.

It raises the error: NotImplementedError: > 1 ndim Categorical are not supported at this time

The set_index method does not work. Is there any way to set some columns as the index when the columns are already a MultiIndex?

import re
import os
import pandas as pd

tgt_file = r'C:\some_path\test_input.xlsx'
df = pd.read_excel(tgt_file, header=None, index_col=None)
# I deliberately did not pass 'header=[0,1,2], index_col=[0,1,2,3]' when reading from Excel,
# because reading without a header avoids getting many 'Unnamed: #' labels in the column header

df_header = df[0:3].fillna(method='ffill', axis=0, limit=3)
df.columns = pd.MultiIndex.from_arrays(df_header.values, names=['1st','2nd','3rd'])
df.drop(index=[0,1,2], inplace=True)

df

Then I got a DataFrame like this. (Sorry, I don't have enough reputation to post a picture.)

How can I set some columns as the index when the columns are already a MultiIndex?
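
For illustration, here is a minimal sketch of one possible approach, not a confirmed fix for the workbook above. It assumes that forward-filling the three header rows downwards left each key column with the same label on every level, for example ('Month', 'Month', 'Month'). The sample data and the 'Plan' / 'Hours' / 'Est' / 'Act' labels are hypothetical stand-ins for the real sheet. The error is probably raised because a top-level label alone selects a 2-D sub-frame rather than a single column, so the sketch passes the full column tuples to set_index instead.

```python
import pandas as pd

# Hypothetical stand-in for the sheet after the header rows were forward-filled:
# the four key columns carry the same label on all three levels.
columns = pd.MultiIndex.from_arrays(
    [
        ["Month", "Week", "Name", "Task", "Plan", "Plan"],
        ["Month", "Week", "Name", "Task", "Hours", "Hours"],
        ["Month", "Week", "Name", "Task", "Est", "Act"],
    ],
    names=["1st", "2nd", "3rd"],
)
df = pd.DataFrame(
    [
        ["Jan", "W1", "Ann", "Design", 8, 6],
        ["Jan", "W2", "Bob", "Build", 16, 20],
    ],
    columns=columns,
)

key_labels = ["Month", "Week", "Name", "Task"]

# Address each key column by its complete tuple, so every key
# resolves to exactly one column (a 1-D Series).
keys = [(label, label, label) for label in key_labels]
indexed = df.set_index(keys)

# The index levels are initially named after the tuples; rename them
# to the plain labels for readability.
indexed.index.names = key_labels

print(indexed)
```

If the real header tuples differ (for instance, if the lower levels are still NaN), the tuples in keys would need to match whatever df.columns actually contains; printing df.columns.tolist() shows the exact values.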

0 answers: