Question

I have a DataFrame with three index levels, like this:

                                               stat1             stat2
sample                        env  run                                                  
sample1                       0    0          36.214             71
                                   1          31.808             71
                                   2          28.376             71
                                   3          20.585             71
sample2                       0    0           2.059             29
                                   1           2.070             29
                                   2           2.038             29

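This represents a process run on different data samples. The process is run several times in different environments, which qualify the results.

It may sound simple, but I am having trouble adding the results for a new environment, which come as a DataFrame: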

            stat1          stat2
run                                                  
0           0.686             29
1           0.660             29
2           0.663             29

This should be indexed under df.loc[["sample1", 1]]. I tried this:

df.loc[["sample1", 1]] = result

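I also tried DataFrame.append. The first attempt just raises a KeyError, while the second does not seem to modify the DataFrame at all.

What am I missing here?

Edit: when using append, as in df.loc["sample"].append(result), the problem is that it messes up the multi-index. It is converted to a single index in which the previous multi-index levels are merged into tuples such as (0, 0) or (0, 1) (meaning env 0, run 1, and so on), and the index of the appended DataFrame (a range index with one entry per run) becomes a new, unwanted set of index values.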

Answer 1

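The core of the problem here is the mismatch between the indexes. One way around it is to change the index of result so that it includes the level-0 and level-1 values you want to set, and then use concat to combine the DataFrames. See the example below: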

In [68]: result.index = list(zip(["sample1"]*len(result), [1]*len(result), result.index))

In [69]: df = pd.concat([df,result])
         df
Out[69]: 
                  stat1  stat2
sample  env run               
sample1 0   0    36.214     71
            1    31.808     71
            2    28.376     71
            3    20.585     71
sample2 0   0     2.059     29
            1     2.070     29
            2     2.038     29
sample1 1   0     0.686     29
            1     0.660     29
            2     0.663     29
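
For anyone who wants to reproduce this outside an IPython session, here is a minimal self-contained sketch of the same idea. The frames are rebuilt from the values in the question. Using MultiIndex.from_product (instead of the zip above), the names combined and index, and the final sort_index are my own choices for illustration, not part of the original session.

```python
import pandas as pd

# Rebuild the question's frames (values copied from the question).
index = pd.MultiIndex.from_tuples(
    [("sample1", 0, 0), ("sample1", 0, 1), ("sample1", 0, 2), ("sample1", 0, 3),
     ("sample2", 0, 0), ("sample2", 0, 1), ("sample2", 0, 2)],
    names=["sample", "env", "run"],
)
df = pd.DataFrame(
    {"stat1": [36.214, 31.808, 28.376, 20.585, 2.059, 2.070, 2.038],
     "stat2": [71, 71, 71, 71, 29, 29, 29]},
    index=index,
)
result = pd.DataFrame(
    {"stat1": [0.686, 0.660, 0.663], "stat2": [29, 29, 29]},
    index=pd.Index([0, 1, 2], name="run"),
)

# Give result the same three named levels as df: sample1 / env 1 / run.
result.index = pd.MultiIndex.from_product(
    [["sample1"], [1], result.index], names=df.index.names
)

# concat returns a new frame; sort_index groups the new env under sample1.
combined = pd.concat([df, result]).sort_index()

# Select the new block with a tuple (one label per level), not a list.
new_env = combined.loc[("sample1", 1)]
print(new_env)
```

Note the tuple in the last lookup: a list such as ["sample1", 1] is read as a set of labels to look up on the first level, which is probably why the assignment in the question ends in a KeyError.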

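Edit: once the index has been changed, you can even use append. Note that append returns a new DataFrame instead of modifying df in place, and that DataFrame.append was removed in pandas 2.0, so this only works on older versions.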

In [21]: result.index = list(zip(["sample1"]*len(result), [1]*len(result), result.index))

In [22]: df.append(result)
Out[22]: 
                  stat1  stat2
sample  env run               
sample1 0   0    36.214     71
            1    31.808     71
            2    28.376     71
            3    20.585     71
sample2 0   0     2.059     29
            1     2.070     29
            2     2.038     29
sample1 1   0     0.686     29
            1     0.660     29
            2     0.663     29
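
A possible alternative, sketched here under the assumption that you would rather not rewrite result.index by hand: pd.concat can prepend the outer levels itself through its keys and names arguments. The variable names labelled and combined are made up for this example, and the data is again copied from the question.

```python
import pandas as pd

# Frames rebuilt from the question's data.
df = pd.DataFrame(
    {"stat1": [36.214, 31.808, 28.376, 20.585, 2.059, 2.070, 2.038],
     "stat2": [71, 71, 71, 71, 29, 29, 29]},
    index=pd.MultiIndex.from_tuples(
        [("sample1", 0, 0), ("sample1", 0, 1), ("sample1", 0, 2),
         ("sample1", 0, 3), ("sample2", 0, 0), ("sample2", 0, 1),
         ("sample2", 0, 2)],
        names=["sample", "env", "run"],
    ),
)
result = pd.DataFrame(
    {"stat1": [0.686, 0.660, 0.663], "stat2": [29, 29, 29]},
    index=pd.Index([0, 1, 2], name="run"),
)

# A tuple key adds one outer level per element: sample="sample1", env=1.
# result itself is left untouched; labelled is a new frame.
labelled = pd.concat([result], keys=[("sample1", 1)], names=["sample", "env"])

# Both frames now share the same three levels, so they stack cleanly.
combined = pd.concat([df, labelled])
print(combined)
```

This keeps result unchanged, which can be handy if the same frame is reused elsewhere.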
