Question

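TL;DR: How do I set values in a MultiIndex Series through an arbitrary slice? I have it working for slices on the outermost level, but not when slicing along a "middle" level.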

Suppose you have a 2- or 3-level MultiIndex Series like this:

_s01_|_s02_|_s03_|____
 'a' | 'c' | 'n' | 0.0
           | 'm' | 0.1
           | 'o' | 0.2
     | 'd' | 'n' | 0.3
           | 'o' | 0.4
 'b' | 'c' | 'n' | 0.5
        .........

Here is what I'm currently trying to do:

r = pd.Series(0, index=data.index)  # create a similar structure
for i in data.index.levels[1]:
    d = data.loc[(slice(None), i, slice(None))]
    # manipulate values in d
    r.loc[(slice(None), i, slice(None))] = d

This does not set the sliced values of r as intended.

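Is there a general way to take a view into a MultiIndex Series and set values through it? I was trying something very similar with a DataFrame, where the same issue (NaN) came from levels being dropped, after which the indexes no longer matched. I fixed that by switching to the syntax I am now trying to use for the Series.

Any help would be greatly appreciated.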

Answer 1

pandas recommends pd.IndexSlice (or similar syntax) over slice(); see the pandas documentation on slicers for more. For example, explicitly:

idx = pd.IndexSlice
series.loc[idx[:, 'c', :]]

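If you only want the complete entries of the selected rows, you can skip the idx step as a shortcut: series.loc[:, 'c', :] (this is basically what happens with plain indexing).

However, pd.IndexSlice is the better choice, and it becomes more necessary when indexing into a DataFrame.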
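To see why the shortcut and the slice() form from the question are interchangeable: as far as I know, pd.IndexSlice just hands back whatever is written between its brackets, so the key it builds is an ordinary tuple of slice objects. A minimal sketch (the names key_from_idx and key_by_hand are only for illustration):

```python
import pandas as pd

idx = pd.IndexSlice

# pd.IndexSlice lets the colon syntax be used outside of .loc[...]
key_from_idx = idx[:, "c", :]

# The same key written out by hand, as in the question
key_by_hand = (slice(None), "c", slice(None))

print(key_from_idx == key_by_hand)
```

So the choice between the two is mostly about readability, not about a different indexing mechanism.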

Say we have your series:

series

>  s01  s02  s03
a    c    n      1
          m      0
          o      4
     d    n      6
          o      9
b    c    n      4
dtype: int64

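Indexing a MultiIndex in pd.Series and pd.DataFrame

Key parts

To index, we first need to sort the index of the Series:

series.sort_index(inplace=True)

Then, for any indexing, we need a pd.IndexSlice object, which defines the selection passed to .loc: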

idx = pd.IndexSlice
# do your indexing
series.loc[idx[:,'c',:]]
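Putting the two steps together, here is a self-contained sketch. It rebuilds the example Series from the printout above (the construction via MultiIndex.from_tuples is my assumption about how the data was made) and uses the non-inplace form of sort_index, which does the same job here:

```python
import pandas as pd

# Rebuild the example Series; values are taken from the printout above.
index = pd.MultiIndex.from_tuples(
    [("a", "c", "n"), ("a", "c", "m"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
series = pd.Series([1, 0, 4, 6, 9, 4], index=index)

# Step 1: sort the index so that slicing on inner levels is possible.
series = series.sort_index()

# Step 2: select every row whose middle level (s02) equals 'c'.
idx = pd.IndexSlice
selected = series.loc[idx[:, "c", :]]
print(selected)
```

Note that the rows come back in sorted index order, so 'm' now precedes 'n'.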

Details

Slice literals written inside a plain list or tuple do not work without pd.IndexSlice:

For a Series:

series.loc[[:,'c',:]] will give you:

File "<ipython-input-101-21968807c1d1>", line 1
    df.loc[[:,'c',:]]
        ^
SyntaxError: invalid syntax


# with IndexSlice
idx = pd.IndexSlice
series.loc[idx[:,'c',:]]

>  s01  s03
a    n      1
     m      0
     o      4
b    n      4
dtype: int64

If we have a pd.DataFrame, we do something similar.

Suppose we have the following pd.DataFrame:

df
>              hello animal   i_like
s01 s02 s03                       
a   c   m        0  Goose  dislike
        n        1  Panda     like
        o        4  Tiger     like
    d   n        6  Goose     like
        o        9   Bear  dislike
b   c   n        4   Dog  dislike

Indexing:

df.sort_index(inplace = True) # need to lexsort for indexing

# without pd.IndexSlice
df.loc[(:,'c',:)]   # bare colons inside a tuple are not valid Python
File "<ipython-input-118-9544c9b9f9da>", line 1
df.loc[(:,'c',:)]
        ^
SyntaxError: invalid syntax

# with pd.IndexSlice
idx = pd.IndexSlice
df.loc[idx[:,'c',:],:]

>             hello animal   i_like
s01 s02 s03                       
a   c   m        0  Goose  dislike
        n        1  Panda     like
        o        4  Tiger     like
b   c   n        4   Dog  dislike

And for specific columns:

df.loc[idx[:,'d',:],['hello','animal']]

>              hello animal
s01 s02 s03              
a   d   n        6  Goose
        o        9   Bear

Setting values

If you want to set values in the selection, you can do it the usual way:

For a Series:

my_select = series.loc[idx[:,'c',:]]
series.loc[idx[:,'c',:]] = my_select.apply(lambda x: x*3)

series
> s01  s02  s03
a    c    m       0
          n       3
          o      12
     d    n       6
          o       9
b    c    n      12
dtype: int64
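Coming back to the loop in the question: one way to sidestep any index alignment between the selection and the target is to assign plain values instead of a Series. This is only a sketch under my own assumptions: the data is the example Series from above, and the "multiply by 10" step stands in for whatever manipulation is actually needed.

```python
import pandas as pd

idx = pd.IndexSlice
index = pd.MultiIndex.from_tuples(
    [("a", "c", "n"), ("a", "c", "m"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
data = pd.Series([1, 0, 4, 6, 9, 4], index=index).sort_index()

r = pd.Series(0, index=data.index)   # same structure as data
for key in data.index.levels[1]:     # 'c', then 'd'
    rows = idx[:, key, :]
    d = data.loc[rows].to_numpy()    # plain values, so nothing is aligned on the index
    r.loc[rows] = d * 10             # placeholder manipulation (assumption)

print(r)
```

Because the right-hand side is a NumPy array, it is written positionally into the selected rows, whatever index the intermediate selection carried.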

For a DataFrame:

my_select = df.loc[idx[:,'d',:],:]
df.loc[idx[:,'d',:],['i_like']] = my_select.apply(
      lambda x: "dislike" if x.hello<5 else "like", axis=1)

df
>             hello animal   i_like
s01 s02 s03                       
a   c   m        0  Goose  dislike
        n        1  Panda     like
        o        4  Tiger     like
    d   n        6  Goose     like
        o        9   Bear     like
b   c   n        4   Dog  dislike

# Only the 'd' rows are touched: Bear (hello = 9) is changed to "like", Goose (hello = 6) stays "like".
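For reference, a self-contained sketch of the same kind of update on the 'd' rows. The DataFrame is rebuilt from the printout above, and I have swapped the row-wise apply for np.where, which is my own substitution and not part of the original answer:

```python
import numpy as np
import pandas as pd

idx = pd.IndexSlice
index = pd.MultiIndex.from_tuples(
    [("a", "c", "m"), ("a", "c", "n"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
df = pd.DataFrame(
    {
        "hello": [0, 1, 4, 6, 9, 4],
        "animal": ["Goose", "Panda", "Tiger", "Goose", "Bear", "Dog"],
        "i_like": ["dislike", "like", "like", "like", "dislike", "dislike"],
    },
    index=index,
).sort_index()

# Select the rows whose middle level is 'd' and recompute i_like from hello.
rows = idx[:, "d", :]
hello = df.loc[rows, "hello"].to_numpy()
df.loc[rows, "i_like"] = np.where(hello < 5, "dislike", "like")

print(df)
```

Only the two 'd' rows are rewritten; every 'c' row keeps its original i_like value.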

PS: Watch the commas and colons (or the lack of them)!

Hope this helps!
