So I have a df like this:
NAME TRY SCORE
Bob 1st 3
Sue 1st 7
Tom 1st 3
Max 1st 8
Jay 1st 4
Mel 1st 7
Bob 2nd 4
Sue 2nd 2
Tom 2nd 6
Max 2nd 4
Jay 2nd 7
Mel 2nd 8
Bob 3rd 3
Sue 3rd 5
Tom 3rd 6
Max 3rd 3
Jay 3rd 4
Mel 3rd 6
I want to count how many times each person scored more than 5, and put the result into a new df2 that looks like this:
NAME COUNT
Bob 0
Sue 1
Tom 2
Max 1
Jay 1
Mel 3
I have tried many things; this is the latest:
df2 = df.groupby('NAME')[['SCORE'] > 5].count().reset_index(name="count")
Answer 0 (score: 3)
First create a boolean mask, then aggregate it with sum; True values are treated as 1:
df2 = (df['SCORE'] > 5).groupby(df['NAME']).sum().astype(int).reset_index(name="count")
print (df2)
NAME count
0 Bob 0
1 Jay 1
2 Max 1
3 Mel 3
4 Sue 1
5 Tom 2
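To make the two steps easier to follow, here is a minimal sketch that rebuilds the sample data from the question and keeps the boolean mask in its own variable. The variable name mask is only for illustration; the rest follows the one-liner above.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "NAME": ["Bob", "Sue", "Tom", "Max", "Jay", "Mel"] * 3,
    "TRY": ["1st"] * 6 + ["2nd"] * 6 + ["3rd"] * 6,
    "SCORE": [3, 7, 3, 8, 4, 7, 4, 2, 6, 4, 7, 8, 3, 5, 6, 3, 4, 6],
})

# Step 1: boolean mask, True wherever the score is above 5.
mask = df["SCORE"] > 5

# Step 2: group the mask by name and sum it; each True counts as 1.
df2 = mask.groupby(df["NAME"]).sum().astype(int).reset_index(name="count")
print(df2)
```

Because the mask is grouped by df["NAME"], names with no score above 5 (Bob here) still appear, with a count of 0.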
Answer 1 (score: 3)
Just use groupby and sum:
df.assign(SCORE=df.SCORE.gt(5)).groupby('NAME')['SCORE'].sum().astype(int).reset_index()
Out[524]:
NAME SCORE
0 Bob 0
1 Jay 1
2 Max 1
3 Mel 3
4 Sue 1
5 Tom 2
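Or we can use set_index with sum:
df.set_index('NAME').SCORE.gt(5).sum(level=0).astype(int)
# sum(level=0) was removed in pandas 2.0; on newer versions group by the index level instead:
df.set_index('NAME').SCORE.gt(5).groupby(level=0).sum().astype(int)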
Answer 2 (score: 1)
One way to do this is to write a custom groupby function that takes the scores of each group and counts the ones greater than 5:
df.groupby('NAME')['SCORE'].agg(lambda x: (x > 5).sum())
NAME
Bob 0
Jay 1
Max 1
Mel 3
Sue 1
Tom 2
Name: SCORE, dtype: int64
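If the result should be a dataframe with NAME and COUNT columns in the order the names first appear, as in the question, the same lambda can be passed through named aggregation. This is a sketch, assuming pandas 0.25 or later (where named aggregation is available); sort=False and the column name COUNT are choices made here to match the requested layout.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "NAME": ["Bob", "Sue", "Tom", "Max", "Jay", "Mel"] * 3,
    "TRY": ["1st"] * 6 + ["2nd"] * 6 + ["3rd"] * 6,
    "SCORE": [3, 7, 3, 8, 4, 7, 4, 2, 6, 4, 7, 8, 3, 5, 6, 3, 4, 6],
})

# sort=False keeps the groups in order of first appearance.
# COUNT=(column, function) names the output column directly.
df2 = (
    df.groupby("NAME", sort=False)
      .agg(COUNT=("SCORE", lambda s: (s > 5).sum()))
      .reset_index()
)
print(df2)
```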
Answer 3 (score: 0)
If you want the counts as a dictionary, you can use collections.Counter:
from collections import Counter
c = Counter(df.loc[df['SCORE'] > 5, 'NAME'])
For a dataframe, you can map the counts onto the unique names.
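The step of turning the Counter into a dataframe might look like the sketch below. It is one possible way to do the mapping, not necessarily the code the answer had in mind; the names variable and the COUNT column are illustrative. It relies on a Counter returning 0 for keys it has never seen.

```python
from collections import Counter

import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "NAME": ["Bob", "Sue", "Tom", "Max", "Jay", "Mel"] * 3,
    "TRY": ["1st"] * 6 + ["2nd"] * 6 + ["3rd"] * 6,
    "SCORE": [3, 7, 3, 8, 4, 7, 4, 2, 6, 4, 7, 8, 3, 5, 6, 3, 4, 6],
})

# Count the names of the rows whose score is above 5.
c = Counter(df.loc[df["SCORE"] > 5, "NAME"])

# Look up every unique name; a Counter yields 0 for missing keys,
# so people who never scored above 5 are still listed.
names = df["NAME"].unique()
df2 = pd.DataFrame({"NAME": names, "COUNT": [c[name] for name in names]})
print(df2)
```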
Answer 4 (score: 0)
First filter the dataframe, then group by NAME and aggregate, and finally reindex to fill in the missing names with 0.
df[df['SCORE'] > 5].groupby('NAME')['SCORE'].size()\
.reindex(df['NAME'].unique(), fill_value=0)
Output:
NAME
Bob 0
Sue 1
Tom 2
Max 1
Jay 1
Mel 3
Name: SCORE, dtype: int64
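The reindex step matters because filtering first removes every row for anyone who never scored above 5. A small sketch on the question's data shows the difference; the variable names above and full are only for illustration.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "NAME": ["Bob", "Sue", "Tom", "Max", "Jay", "Mel"] * 3,
    "TRY": ["1st"] * 6 + ["2nd"] * 6 + ["3rd"] * 6,
    "SCORE": [3, 7, 3, 8, 4, 7, 4, 2, 6, 4, 7, 8, 3, 5, 6, 3, 4, 6],
})

# Filter first: Bob has no rows left, so he gets no group at all.
above = df[df["SCORE"] > 5].groupby("NAME")["SCORE"].size()
print("Bob" in above.index)

# Reindexing on all unique names restores him with a count of 0.
full = above.reindex(df["NAME"].unique(), fill_value=0)
print(full)
```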