Pandas cumulative diff from groupby

Time: 2018-05-02 16:13:37

Tags: python pandas dataframe group-by pandas-groupby

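I need to calculate the difference from the start of each MultiIndex level, so that I can compute the decay from the start of the level. My sample input and expected output look like this:

import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays, columns=['values'])

   bar one    1000
       two     800
       three   500
   foo one     800
       two     400
       three   200

expected_result = pd.DataFrame([np.nan, -200, -500, np.nan, -400, -600], index=arrays, columns=['values'])

   bar one      NaN
       two     -200
       three   -500
   foo one      NaN
       two     -400
       three   -600

I can use groupby to get the difference between consecutive cells within a level:

df.groupby(level=0)['values'].diff()

But that is not what I want!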
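
To make the gap concrete, here is a minimal sketch of what the per-group diff returns on the sample data. The column name 'values' is an assumption, chosen so that the ['values'] selection resolves.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
# Assumption: the single column is named 'values'.
df = pd.DataFrame({'values': [1000, 800, 500, 800, 400, 200]}, index=arrays)

# Per-group diff: each row minus the previous row within the same level-0 group.
step_diff = df.groupby(level=0)['values'].diff()
print(step_diff.tolist())
# [nan, -200.0, -300.0, nan, -400.0, -200.0]
```

These are row-to-row steps (-300 for bar/three, -200 for foo/three), whereas the expected result measures every row against the first row of its level (-500 and -600).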

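Alas, the accepted answer is not quite what I wanted. Here is a better example:

df.groupby(level=0).diff().cumsum()

For the df above, this gives the following instead of the expected result:

pd.DataFrame([np.nan, -200, -500, np.nan, -900, -1100], index=arrays)

   bar one      NaN
       two     -200
       three   -500
   foo one      NaN
       two     -900
       three  -1100

The running sum carries over from bar into foo instead of restarting at each level.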

2 Answers:

Answer 0: (score: 3)

Are you looking for a cumsum afterwards?

df.groupby(level=0)['values'].diff().cumsum()

Answer 1: (score: 1)

You can get what I wanted by chaining another groupby:

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays, columns=['values'])

   bar one    1000
       two     800
       three   500
   foo one     800
       two     400
       three   200

expected_result = pd.DataFrame([np.nan, -200, -500, np.nan, -400, -600], index=arrays, columns=['values'])

df.groupby(level=0).diff().groupby(level=0).cumsum()

   bar one      NaN
       two     -200
       three   -500
   foo one      NaN
       two     -400
       three   -600
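
For anyone who wants to check this end to end, here is a self-contained sketch of the chained version. The column name 'values' is again an assumption. The second computation, which subtracts each group's first value via transform('first'), is an alternative that does not appear in the answers above. It yields 0 rather than NaN on the first row of each group.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
# Assumption: the single column is named 'values'.
df = pd.DataFrame({'values': [1000, 800, 500, 800, 400, 200]}, index=arrays)

# Chained version: per-group diff, then a cumsum that restarts in each group.
chained = df.groupby(level=0)['values'].diff().groupby(level=0).cumsum()
print(chained.tolist())
# [nan, -200.0, -500.0, nan, -400.0, -600.0]

# Alternative sketch: subtract the first value of each level-0 group.
first = df.groupby(level=0)['values'].transform('first')
from_start = df['values'] - first
print(from_start.tolist())
# [0, -200, -500, 0, -400, -600]
```

The second groupby is what resets the running sum at each level. The subtraction form skips the diff/cumsum round trip, so pick it only if a 0 on the first row of each group is acceptable.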