Question

In Python, I have a pandas DataFrame df that looks like this:

 ID      Geo    Speed
123    False       40
123     True       90
123     True       80
123    False       50
123     True       10
456    False       10
456     True       90
456    False       40
456     True       80

I want to group df by ID, filter out the rows where Geo == False, and get the mean of Speed within each group. So the result should look like this:

 ID     Mean 
123       60  
456       85

My attempts:

df.groupby('ID')["Geo" == False].Speed.mean()
df.groupby('ID').filter(lambda g: g.Geo == False)
df[df.Geo.groupby(df.ID) == False]

None of them worked. Is there a way to do this? Thanks!

Answer 1

Use ~ to invert the boolean mask (True becomes False and vice versa), then use boolean indexing to keep the rows where Geo is False:

print (df[~df["Geo"]])
    ID    Geo  Speed
0  123  False     40
3  123  False     50
5  456  False     10
7  456  False     40

df = df[~df["Geo"]].groupby('ID', as_index=False).Speed.mean()
print (df)
    ID  Speed
0  123     45
1  456     25

To filter by the True rows instead, drop the ~ and index with df["Geo"] directly; on the sample data this gives the means 60 and 85 asked for in the question.
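For reference, here is a minimal self-contained sketch of the True-filter variant. It rebuilds the sample data from the question; the variable name result is only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [123, 123, 123, 123, 123, 456, 456, 456, 456],
    "Geo": [False, True, True, False, True, False, True, False, True],
    "Speed": [40, 90, 80, 50, 10, 10, 90, 40, 80],
})

# Boolean indexing keeps only the rows where Geo is True;
# the remaining rows are then grouped by ID and Speed is averaged.
result = df[df["Geo"]].groupby("ID", as_index=False)["Speed"].mean()
print(result)
```

Filtering before grouping keeps the groupby small and avoids groupby(...).filter(...), which drops or keeps whole groups rather than individual rows.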

Answer 2

By using pivot_table, you get the mean for both the True and the False rows at once:

df.pivot_table('Speed','ID','Geo',aggfunc='mean')
Out[154]: 
Geo  False  True 
ID               
123     45     60
456     25     85
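The same call can be written with keyword arguments, which may be easier to read. The sketch below assumes the sample data from the question; the column names mean_geo_false and mean_geo_true are hypothetical labels chosen here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [123, 123, 123, 123, 123, 456, 456, 456, 456],
    "Geo": [False, True, True, False, True, False, True, False, True],
    "Speed": [40, 90, 80, 50, 10, 10, 90, 40, 80],
})

# One row per ID, one column per Geo value (False first, then True).
table = df.pivot_table(values="Speed", index="ID", columns="Geo", aggfunc="mean")

# Hypothetical, more descriptive column labels (not from the original answer).
table.columns = ["mean_geo_false", "mean_geo_true"]

# The column the question asks for: mean Speed over the Geo == True rows.
true_means = table["mean_geo_true"]
print(table)
```

This computes both means even if only one is needed, so for large frames filtering first is likely cheaper.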

Pandas: group by, filter rows, get the mean
