The data might look like this:
user RO retweetID
jim o (null)
jim o (null)
jim r r8
bill o (null)
bill r r3
fred o (null)
fred r r6
fred r r6
fred r r1
I want to compute ocount, rcount, and avgercount (total number of r's / number of distinct r's), so I should get:
user ocount rcount avgercount
jim 2 1 1
bill 1 1 1
fred 1 3 1.5
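I'm stuck right at the start in SQLFiddle. Any help is much appreciated.
EDIT: To clarify: avgercount = (total number of r's / number of distinct r's). Fred, for example, has three retweets (r6, r6, r1), but only two of them are distinct, so avgercount = 3/2.
Answer 0 (score: 3)
Just use conditional aggregation to get the basic counts: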
select user,
       sum(case when ro = 'o' then 1 else 0 end) as o_cnt,
       sum(case when ro = 'r' then 1 else 0 end) as r_cnt,
       avg(case when ro = 'r' then 1.0 else 0.0 end) as avg_r,
       sum(case when ro = 'r' then 1.0 else 0 end)
         / count(distinct case when ro = 'r' then retweetID end) as retweet_ratio
from t
group by user;
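avg_r is the fraction of a user's rows that are 'r'; it was not clear from the original question which calculation was wanted. retweet_ratio follows the clarified definition (total number of r's / number of distinct r's).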
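As a quick sanity check, here is a minimal sketch that runs the query above against the sample rows from the question. It assumes an in-memory SQLite database (via Python's sqlite3) as a stand-in for the real server, and it leaves out avg_r to keep the comparison with the expected table short. The names ROWS, QUERY and run_query are made up for this illustration.

```python
import sqlite3

# Sample rows from the question: (user, RO, retweetID)
ROWS = [
    ("jim", "o", None), ("jim", "o", None), ("jim", "r", "r8"),
    ("bill", "o", None), ("bill", "r", "r3"),
    ("fred", "o", None), ("fred", "r", "r6"),
    ("fred", "r", "r6"), ("fred", "r", "r1"),
]

# Same conditional aggregation as in the answer (avg_r omitted).
QUERY = """
select user,
       sum(case when ro = 'o' then 1 else 0 end) as o_cnt,
       sum(case when ro = 'r' then 1 else 0 end) as r_cnt,
       sum(case when ro = 'r' then 1.0 else 0 end)
         / count(distinct case when ro = 'r' then retweetID end) as retweet_ratio
from t
group by user
"""

def run_query(rows):
    """Load rows into a throwaway SQLite table and return {user: (o, r, ratio)}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (user text, ro text, retweetID text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = {u: (o, r, ratio) for u, o, r, ratio in conn.execute(QUERY)}
    conn.close()
    return result

result = run_query(ROWS)
for name in ("jim", "bill", "fred"):
    print(name, *result[name])
```

For fred this gives 1 original, 3 retweets and a ratio of 3/2, matching the table in the question.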
In MySQL, this can be shortened to:
select user,
       sum( ro = 'o' ) as o_cnt,
       sum( ro = 'r' ) as r_cnt,
       sum( ro = 'r' )
         / count(distinct case when ro = 'r' then retweetID end) as retweetid_ratio
from t
group by user;
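One caveat if you port the short form elsewhere: MySQL's / operator returns a decimal even for two integers, which is why no 1.0 is needed there. Engines that do integer division would truncate 3/2 to 1. The sketch below illustrates this with SQLite (again through Python's sqlite3, purely as an assumed stand-in for such an engine), using only fred's rows; DISTINCT_R, int_ratio and float_ratio are names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (user text, ro text, retweetID text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [("fred", "o", None), ("fred", "r", "r6"),
     ("fred", "r", "r6"), ("fred", "r", "r1")],
)

# Number of distinct retweet IDs among the 'r' rows.
DISTINCT_R = "count(distinct case when ro = 'r' then retweetID end)"

# Integer / integer: SQLite truncates the result.
(int_ratio,) = conn.execute(
    f"select sum(ro = 'r') / {DISTINCT_R} from t"
).fetchone()

# Multiplying by 1.0 first forces floating-point division.
(float_ratio,) = conn.execute(
    f"select sum(ro = 'r') * 1.0 / {DISTINCT_R} from t"
).fetchone()

conn.close()
print(int_ratio, float_ratio)
```

So outside MySQL, keep the 1.0 (or an explicit cast) from the longer version of the query.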