我有一个带有多索引的数据框。我想在满足第一个索引的某些条件时更改第二个索引的值。 我在这里找到了一个类似(但不同)的问题:Replace a value in MultiIndex (pandas) 这没有回答我的观点,因为那是关于改变单行,并且解决方案也传递了第一个索引的值(不需要更改)。在我的情况下,我正在处理多行,我无法使该解决方案适应我的情况。
我的数据的最小示例如下。谢谢!
import pandas as pd
import numpy as np
consdf=pd.DataFrame()
for mylocation in ['North','South']:
for scenario in np.arange(1,4):
df= pd.DataFrame()
df['mylocation'] = [mylocation]
df['scenario']= [scenario]
df['this'] = np.random.randint(10,100)
df['that'] = df['this'] * 2
df['something else'] = df['this'] * 3
consdf=pd.concat((consdf, df ), axis=0, ignore_index=True)
mypiv = consdf.pivot('mylocation','scenario').transpose()
level_list =['this','that']
# if level 0 is in level_list --> set level 1 to np.nan
mypiv.iloc[mypiv.index.get_level_values(0).isin(level_list)].index.set_levels([np.nan], level =1, inplace=True)
最后一行不起作用:我得到:
ValueError: On level 1, label max (2) >= length of level (1). NOTE: this index is in an inconsistent state
答案 0 :(得分:1)
注意: ix
已在Pandas 0.20+中弃用。请改用loc
访问者。
以下是使用reset_index()
方法的解决方案:
In [95]: new = mypiv.reset_index()
In [96]: new
Out[96]:
mylocation level_0 scenario North South
0 this 1 32 64
1 this 2 18 40
2 this 3 76 56
3 that 1 64 128
4 that 2 36 80
5 that 3 152 112
6 something else 1 96 192
7 something else 2 54 120
8 something else 3 228 168
In [100]: new.ix[new.level_0.isin(level_list), 'scenario'] = np.nan
In [101]: new
Out[101]:
mylocation level_0 scenario North South
0 this NaN 32 64
1 this NaN 18 40
2 this NaN 76 56
3 that NaN 64 128
4 that NaN 36 80
5 that NaN 152 112
6 something else 1.0 96 192
7 something else 2.0 54 120
8 something else 3.0 228 168
In [103]: mypiv = new.set_index(['level_0', 'scenario'])
In [104]: mypiv
Out[104]:
mylocation North South
level_0 scenario
this NaN 32 64
NaN 18 40
NaN 76 56
that NaN 64 128
NaN 36 80
NaN 152 112
something else 1.0 96 192
2.0 54 120
3.0 228 168
但我怀疑有更优雅的解决方案。
答案 1 :(得分:1)
IIUC您可以为级别值添加新值,然后使用advanced indexing,get_level_values
,set_levels
和set_labels
方法更改索引的标签:
len_ind = len(mypiv.loc[(level_list,)].index.get_level_values(1))
mypiv.index.set_levels([1, 2, 3, np.nan], level=1, inplace=True)
mypiv.index.set_labels([3]*len_ind + mypiv.index.labels[1][len_ind:].tolist(), level=1, inplace=True)
In [219]: mypiv
Out[219]:
mylocation North South
scenario
this NaN 26 46
NaN 32 67
NaN 75 30
that NaN 52 92
NaN 64 134
NaN 150 60
something else 1.0 78 138
2.0 96 201
3.0 225 90
注意其他scenario
的值将转换为float,因为它应该是一种类型而np.nan
具有浮点类型。