Question

Given the following DataFrame, df1:

In[0]: df1
Out[0]:
                     A         B
first second                    
bar   one     1.764052  0.400157
      one     0.978738  2.240893
      one     1.867558 -0.977278
      two     0.950088 -0.151357

I want to insert additional rows right after the last ('bar', 'one') row of this MultiIndex DataFrame, and give the newly added rows that same MultiIndex value.

That is, for the following df2:

In[1]: df2
Out[1]:
                     A         B
first second                    
baz   three  -0.103219  0.410599
      three   0.144044  1.454274
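For anyone who wants to reproduce the problem, here is a minimal sketch that rebuilds both frames from the values printed above. The literal numbers are copied from the displayed output; how the frames were originally generated is not stated in the question, so this construction is an assumption.

```python
import pandas as pd

# df1: four rows under a non-unique MultiIndex (three ('bar', 'one') rows).
idx1 = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "one"), ("bar", "one"), ("bar", "two")],
    names=["first", "second"],
)
df1 = pd.DataFrame(
    {
        "A": [1.764052, 0.978738, 1.867558, 0.950088],
        "B": [0.400157, 2.240893, -0.977278, -0.151357],
    },
    index=idx1,
)

# df2: the two rows to be inserted, currently labelled ('baz', 'three').
idx2 = pd.MultiIndex.from_tuples(
    [("baz", "three"), ("baz", "three")],
    names=["first", "second"],
)
df2 = pd.DataFrame(
    {"A": [-0.103219, 0.144044], "B": [0.410599, 1.454274]},
    index=idx2,
)

print(df1)
print(df2)
```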

The desired result is:

                     A         B
first second                    
bar   one     1.764052  0.400157
      one     0.978738  2.240893
      one     1.867558 -0.977278
      one    -0.103219  0.410599  # these 2 rows
      one     0.144044  1.454274  # arrived from df2
      two     0.950088 -0.151357

That is the question.

Some attempts of mine that did not work:

(1) Iterating over the groups (using groupby) and assembling a new DataFrame from the df2 values:

for idx, data in df1.groupby(level=[0, 1]):
    df1.loc[idx] = pd.concat([data, pd.DataFrame(df2, index=idx)], ignore_index=True)

Exception: cannot handle a non-unique multi-index!

(I also tried putting them into a new DataFrame.)

(2) Reindexing df2:

for idx, data in df1.groupby(level=[0, 1]):
    df2.reindex(idx)

Exception: cannot handle a non-unique multi-index!

Or:

for idx, data in df1.groupby(level=[0, 1]):
    df2.index = idx
    break

            A         B
bar -0.103219  0.410599
one  0.144044  1.454274

Answer 1

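If you want to insert data into an existing DataFrame by hand, you need to decide a couple of things first.

Where do you want to insert? I settle this by finding the first row whose index is ('bar', 'two'); the new rows go immediately before it, which places them right after the last ('bar', 'one') row.
What will the inserted data be called? In other words, what index should the inserted rows carry? You are clearly changing their index values, so you have to know in advance what those values are, unless you want the new rows to inherit the index of the row just before the insertion point (I show that as well).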

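# Integer position of the first ('bar', 'two') row; the new rows go just before it.
position = (df1.index.to_series() == ('bar', 'two')).values.argmax()

# Relabel both df2 rows as ('bar', 'one'), then splice them in between the two halves of df1.
pd.concat([
    df1.iloc[:position],
    df2.set_index([['bar', 'bar'], ['one', 'one']]),
    df1.iloc[position:]
])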
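The splice above can be tried end to end with the sketch below. It rebuilds the two frames from the printed values and, as an assumption of this sketch rather than the answer's own code, locates the insertion point with a plain Python list lookup and assigns a named MultiIndex so the `first`/`second` level names match df1.

```python
import pandas as pd

names = ["first", "second"]
df1 = pd.DataFrame(
    {
        "A": [1.764052, 0.978738, 1.867558, 0.950088],
        "B": [0.400157, 2.240893, -0.977278, -0.151357],
    },
    index=pd.MultiIndex.from_tuples(
        [("bar", "one")] * 3 + [("bar", "two")], names=names
    ),
)
df2 = pd.DataFrame(
    {"A": [-0.103219, 0.144044], "B": [0.410599, 1.454274]},
    index=pd.MultiIndex.from_tuples([("baz", "three")] * 2, names=names),
)

# Position of the first ('bar', 'two') row, found on the list of index tuples.
position = df1.index.tolist().index(("bar", "two"))

# Copy df2 and relabel every row as ('bar', 'one').
relabeled = df2.copy()
relabeled.index = pd.MultiIndex.from_tuples(
    [("bar", "one")] * len(df2), names=names
)

# Rows before the insertion point, the relabeled rows, then the rest.
result = pd.concat([df1.iloc[:position], relabeled, df1.iloc[position:]])
print(result)
```

Note that `pd.concat` returns a new frame; df1 and df2 themselves are left unchanged.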

An example in which the new rows inherit the index values of the preceding row (the result is the same as above):

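# Integer position of the first ('bar', 'two') row.
position = (df1.index.to_series() == ('bar', 'two')).values.argmax()
# Repeat the index tuple of the row just before that position, once per df2 row.
insert_idx = pd.MultiIndex.from_tuples(df1.index[[position - 1]].tolist() * len(df2))

pd.concat([
    df1.iloc[:position],
    df2.set_index(insert_idx),
    df1.iloc[position:]
])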
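If this comes up repeatedly, the same idea could be wrapped in a small helper. The function below, `insert_after_key`, is a hypothetical name introduced here for illustration; it is not part of pandas or of the answer. It is a sketch that assumes the key is a full index tuple and inserts the new rows directly after the key's last occurrence, rather than looking for the row that follows.

```python
import pandas as pd


def insert_after_key(df, key, new_rows):
    """Return a copy of df with new_rows inserted after the last row labelled key.

    Hypothetical helper: key must be a full index tuple present in df.index.
    """
    labels = df.index.tolist()
    if key not in labels:
        raise KeyError(key)
    # One past the last occurrence of key, found by searching the reversed list.
    position = len(labels) - labels[::-1].index(key)
    relabeled = new_rows.copy()
    relabeled.index = pd.MultiIndex.from_tuples(
        [key] * len(new_rows), names=df.index.names
    )
    return pd.concat([df.iloc[:position], relabeled, df.iloc[position:]])


names = ["first", "second"]
df1 = pd.DataFrame(
    {
        "A": [1.764052, 0.978738, 1.867558, 0.950088],
        "B": [0.400157, 2.240893, -0.977278, -0.151357],
    },
    index=pd.MultiIndex.from_tuples(
        [("bar", "one")] * 3 + [("bar", "two")], names=names
    ),
)
df2 = pd.DataFrame(
    {"A": [-0.103219, 0.144044], "B": [0.410599, 1.454274]},
    index=pd.MultiIndex.from_tuples([("baz", "three")] * 2, names=names),
)

out = insert_after_key(df1, ("bar", "one"), df2)
print(out)
```

Searching from the end means the helper also works when the key labels the final row of the frame, in which case the new rows are simply appended.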

Append a DataFrame to a specific MultiIndex
