Calculating rank within each group

Time: 2012-12-17 14:25:58

Tags: r plyr ranking

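I have a data frame df with columns type and x. I want to rank the rows by x within each type, and record for each row how many other rows of the same type it has a higher x than (column pos).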

For example:

df <- data.frame(type = c("a","a","a","b","b","b"),x=c(1,77,1,34,1,8))
# for type a, after sorting by x, row 3 (x = 77) has a higher x than rows 1 and 2, so its pos value is 2

I can do it like this:

library(plyr)
df <- data.frame(type = c("a","a","a","b","b","b"),x=c(1,77,1,34,1,8))
df <- ddply(df,.(type), function(x) x[with(x, order(x)) ,])
df <- ddply(df,.(type), transform, pos = (seq_along(x)-1) )

     type  x pos
1    a  1   0
2    a  1   1
3    a 77   2
4    b  1   0
5    b  8   1
6    b 34   2

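However, this approach does not account for the tie between rows 1 and 2 of type a. What is the easiest way to get output in which tied rows receive the same pos value, for example: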

     type  x pos
 1    a  1   0
 2    a  1   0
 3    a 77   2
 4    b  1   0
 5    b  8   1
 6    b 34   2

1 answer:

Answer 0 (score: 8)

ddply(df,.(type), transform, pos = rank(x,ties.method ="min")-1)

  type  x pos
1    a  1   0
2    a 77   2
3    a  1   0
4    b 34   2
5    b  1   0
6    b  8   1
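
Note that the output above keeps the original row order within each type, whereas the desired output in the question is sorted by x. As a rough sketch only (not part of the original answer), the same pos column can also be computed in base R with ave(), followed by an optional sort. The variable name sorted is made up for illustration. With ties.method = "min", tied values all get the lowest rank in the tie, so subtracting 1 gives the number of rows in the group with a strictly smaller x.

```r
# Base R sketch, assuming the same example data as in the question.
df <- data.frame(type = c("a","a","a","b","b","b"), x = c(1,77,1,34,1,8))

# ave() applies FUN to x separately within each level of type.
# rank(..., ties.method = "min") - 1 counts the rows with a strictly smaller x.
df$pos <- ave(df$x, df$type, FUN = function(v) rank(v, ties.method = "min") - 1)

# Optional: sort by type, then x, to match the layout shown in the question.
# "sorted" is a hypothetical name used only for this illustration.
sorted <- df[order(df$type, df$x), ]
print(sorted)
```

Whether to sort afterwards depends on whether the original row order matters downstream; the pos values themselves do not depend on row order.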