Question

I am trying to understand how to apply a function within a 'groupby', i.e. to each group of a DataFrame.

import pandas as pd
import numpy as np
df = pd.DataFrame({'Stock' : ['apple', 'ford', 'google', 'samsung','walmart', 'kroger'],
                   'Sector' : ['tech', 'auto', 'tech', 'tech','retail', 'retail'],
                   'Price': np.random.randn(6),
                   'Signal' : np.random.randn(6)},  columns= ['Stock','Sector','Price','Signal'])
dfg = df.groupby(['Sector'],as_index=False)

type(dfg)
pandas.core.groupby.DataFrameGroupBy

I want to get the sum of (Price * (1 / Signal)) grouped by 'Sector', i.e. the resulting output should look like

Sector  |   Value
auto    |  0.744944
retail  | -0.572164053
tech    | -1.454632

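I can get the result by creating a separate data frame, but I am looking for a way to operate on each of the grouped (per-sector) frames directly.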

I can find the mean or the sum of Price:

dfg.agg({'Price' : [np.mean, np.sum] }).head(2)

But I cannot get the sum of (Price * (1 / Signal)), which is what I need.

Thanks,

Answer 1

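You provided random data, so there is no way for us to reproduce the exact numbers you got. But based on what you just described, I think the following will do it: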

In [121]:

(df.Price/df.Signal).groupby(df.Sector).sum()
Out[121]:
Sector
auto     -1.693373
retail   -5.137694
tech     -0.984826
dtype: float64
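
Since the question asks specifically about operating on each grouped frame, here is a minimal sketch of two other ways to get the same per-sector sum as a Sector/Value table. The fixed Price and Signal values are made up for illustration (the original data is random), and the names per_group and via_assign are not from the question. Option 1 passes each group's sub-frame to apply; Option 2 computes the ratio column first and then aggregates, which is usually simpler.

```python
import pandas as pd

# Fixed, made-up values (an assumption for reproducibility; the question uses random data).
df = pd.DataFrame({
    'Stock': ['apple', 'ford', 'google', 'samsung', 'walmart', 'kroger'],
    'Sector': ['tech', 'auto', 'tech', 'tech', 'retail', 'retail'],
    'Price': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    'Signal': [2.0, 4.0, 1.0, 8.0, 10.0, 3.0],
})

# Option 1: operate on each group's sub-frame g with apply.
per_group = (
    df.groupby('Sector')[['Price', 'Signal']]
      .apply(lambda g: (g['Price'] / g['Signal']).sum())
      .reset_index(name='Value')
)

# Option 2: build the ratio column first, then do an ordinary grouped sum.
via_assign = (
    df.assign(Value=df['Price'] / df['Signal'])
      .groupby('Sector', as_index=False)['Value']
      .sum()
)

# Both give: auto 0.5, retail 2.5, tech 4.0
print(per_group)
print(via_assign)
```

With apply, the lambda receives a small DataFrame per sector, so any expression over several columns can go there. For a plain sum of a ratio, the vectorized forms (the one-liner above, or Option 2) should generally be faster on large frames.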

Operations within a DataFrameGroupBy
