Question

I have a MultiIndex DataFrame (a pivot table) with 1703 rows that looks like this:

Local code     Ex Code    ...  Value      
159605         FR1xx      ...  30               
159973         FR1xx      ...  50    
...
ZZC923HDV906   XYxx       ...  20

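The local codes are either numeric (e.g. 159973) or a mix of letters and digits (e.g. ZZC923HDV906). I want to select data by the first index level (Local code). For the string codes, this works with the following code: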

pv_comb[(pv_comb.index.get_level_values("Local code") == "ZZC923HDV906")]

But I cannot select the numeric values:

pv_comb[(pv_comb.index.get_level_values("Local code") == 159973)]

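This returns an empty DataFrame. Is it possible to convert the values in the first level of the MultiIndex to strings and then select the data?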

Answer 1

IIUC you need quotes, because your numeric values are stored as strings, so change 159973 to '159973':

pv_comb[(pv_comb.index.get_level_values("Local code") == '159973')]
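One quick way to check this diagnosis is to look at the Python types stored in the level. The sketch below is a minimal illustration on made-up data shaped like the table in the question. It assumes the numeric-looking codes are stored as strings, which is what the empty result suggests; the names codes, label_types, by_int and by_str are only illustrative.

```python
import pandas as pd

# Toy index mimicking the question: every 'Local code' is stored as a string
# (an assumption about the real data, not something the question confirms).
index = pd.MultiIndex.from_arrays(
    [['159605', '159973', 'ZZC923HDV906'], ['FR1xx', 'FR1xx', 'XYxx']],
    names=['Local code', 'Ex Code'],
)
pv_comb = pd.DataFrame({'Value': [30, 50, 20]}, index=index)

codes = pv_comb.index.get_level_values('Local code')

# Collect the Python type name of each label in the level.
label_types = {type(code).__name__ for code in codes}

# Comparing against the integer matches nothing; the string matches one row.
by_int = pv_comb[codes == 159973]
by_str = pv_comb[codes == '159973']

print(label_types)
print(len(by_int), len(by_str))
```

If label_types contains only str, the integer comparison can never match, and quoting the value is enough.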

If you need to convert some levels of a MultiIndex to string, you need to create a new index and then assign it:

#if python 3 add list
new_index = list(zip(df.index.get_level_values('Local code').astype(str),
                df.index.get_level_values('Ex Code')))

df.index = pd.MultiIndex.from_tuples(new_index, names = df.index.names)
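As a variation on the snippet above, the same rebuild can probably be written with pd.MultiIndex.from_arrays, which takes one array per level and avoids the zip step. This is only a sketch on the small sample frame used later in this answer, not a claim about how the real pivot table was built.

```python
import pandas as pd

# Sample frame with a mixed int/str first level, as in the question.
df = pd.DataFrame({'Local code': [159605, 159973, 'ZZC923HDV906'],
                   'Ex Code': ['FR1xx', 'FR1xx', 'XYxx'],
                   'Value': [30, 50, 20]}).set_index(['Local code', 'Ex Code'])

# Rebuild the index level by level, casting only 'Local code' to str.
df.index = pd.MultiIndex.from_arrays(
    [df.index.get_level_values('Local code').astype(str),
     df.index.get_level_values('Ex Code')],
    names=df.index.names,
)

# The formerly numeric code can now be selected as a string.
selected = df[df.index.get_level_values('Local code') == '159973']
print(selected)
```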

There may also be some whitespace, which you can remove with strip:

# rebuild the MultiIndex with whitespace stripped from 'Local code'
new_index = list(zip(df.index.get_level_values('Local code').astype(str).str.strip(),
                     df.index.get_level_values('Ex Code')))
df.index = pd.MultiIndex.from_tuples(new_index, names = df.index.names)
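To see the effect of stripping, here is a small sketch on invented data where two codes carry stray spaces. The padded labels are an assumption made for illustration; the question does not show whether the real index contains whitespace.

```python
import pandas as pd

# Hypothetical index whose first level has leading/trailing spaces.
df = pd.DataFrame(
    {'Value': [30, 50, 20]},
    index=pd.MultiIndex.from_arrays(
        [[' 159605', '159973 ', 'ZZC923HDV906'], ['FR1xx', 'FR1xx', 'XYxx']],
        names=['Local code', 'Ex Code'],
    ),
)

# Before stripping, an exact match on '159973' finds nothing.
before = df[df.index.get_level_values('Local code') == '159973']

# Rebuild the index with the whitespace removed from 'Local code'.
new_index = list(zip(df.index.get_level_values('Local code').astype(str).str.strip(),
                     df.index.get_level_values('Ex Code')))
df.index = pd.MultiIndex.from_tuples(new_index, names=df.index.names)

# After stripping, the same comparison matches one row.
after = df[df.index.get_level_values('Local code') == '159973']
print(len(before), len(after))
```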

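If there are many levels, first reset_index the problematic level, do the operation, and then set the index back. After that, sorting the index (sortlevel, or sort_index in newer pandas versions) may be necessary: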

import pandas as pd

df = pd.DataFrame({'Local code':[159605,159973,'ZZC923HDV906'],
                   'Ex Code':['FR1xx','FR1xx','XYxx'],
                   'Value':[30,50,20]})
pv_comb = df.set_index(['Local code','Ex Code'])
print (pv_comb)
                      Value
Local code   Ex Code       
159605       FR1xx       30
159973       FR1xx       50
ZZC923HDV906 XYxx        20

pv_comb = pv_comb.reset_index('Local code')
pv_comb['Local code'] = pv_comb['Local code'].astype(str)
pv_comb = pv_comb.set_index('Local code', append=True).swaplevel(0,1)
print (pv_comb)
                      Value
Local code   Ex Code       
159605       FR1xx       30
159973       FR1xx       50
ZZC923HDV906 XYxx        20
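Once the level holds only strings, the selection from the question should work, and the index can also be sorted. The following sketch repeats the steps above on the same sample data, then sorts with sort_index and selects with DataFrame.xs. Using xs here is a suggestion of this sketch rather than part of the original answer; the boolean mask from the question works equally well.

```python
import pandas as pd

df = pd.DataFrame({'Local code': [159605, 159973, 'ZZC923HDV906'],
                   'Ex Code': ['FR1xx', 'FR1xx', 'XYxx'],
                   'Value': [30, 50, 20]})
pv_comb = df.set_index(['Local code', 'Ex Code'])

# Move the problematic level to a column, cast it, and put it back first.
pv_comb = pv_comb.reset_index('Local code')
pv_comb['Local code'] = pv_comb['Local code'].astype(str)
pv_comb = pv_comb.set_index('Local code', append=True).swaplevel(0, 1)

# With uniform string labels the index can be sorted without type errors.
pv_comb = pv_comb.sort_index()

# Select every row whose 'Local code' equals the (string) code;
# xs drops the selected level, leaving 'Ex Code' as the index.
selected = pv_comb.xs('159973', level='Local code')
print(selected)
```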

Select data from pandas MultiIndex pivot table
