Create a rank column in a data frame that combines two other columns in R

Time: 2016-02-02 12:12:32

Tags: r dataframe rank

I want to add a rank column to a data frame. My data frame looks like this:

df <- data.frame(category = rep(c('c1', 'c2', 'c3'), each = 3),
                 id = 1:9,
                 count = c(10, 10, 10, 9, 8, 8, 7, 6, 4))

What I want is:

category id count rank
c1       1   10    9
c2       4   9     8
c3       7   7     7
c1       2   10    6
c2       5   8     5
c3       8   6     4
c1       3   10    3
c2       6   8     2
c3       9   4     1

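I want the rank to cycle through the different categories first and then follow the count within each category, from high to low.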

3 Answers:

Answer 0 (score: 3)

Here is a possible implementation using a rank/order combination:

library(data.table)
indx <- setDT(df)[, frank(-count, ties.method = "first"), by = category]$V1
df[order(indx)][, Rank := .N:1][]
#    category id count Rank
# 1:       c1  1    10    9
# 2:       c2  4     9    8
# 3:       c3  7     7    7
# 4:       c1  2    10    6
# 5:       c2  5     8    5
# 6:       c3  8     6    4
# 7:       c1  3    10    3
# 8:       c2  6     8    2
# 9:       c3  9     4    1
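
The same rank/order idea can be sketched in base R, without data.table. This is only an illustration under a few assumptions: the helper names `pos` and `res` are made up for the example, and `category` is added as an explicit tie-breaker in `order()` rather than relying on the existing row order.

```r
# Rebuild the example data from the question.
df <- data.frame(category = rep(c('c1', 'c2', 'c3'), each = 3),
                 id = 1:9,
                 count = c(10, 10, 10, 9, 8, 8, 7, 6, 4))

# Position of each row within its category, highest count first.
# Ties keep their order of appearance (ties.method = "first").
pos <- ave(-df$count, df$category,
           FUN = function(x) rank(x, ties.method = "first"))

# Interleave the categories: every first-place row, then every
# second-place row, and so on.
res <- df[order(pos, df$category), ]

# Number the interleaved rows from n down to 1.
res$rank <- rev(seq_len(nrow(res)))
rownames(res) <- NULL
res
```

For the example data this puts the rows in id order 1, 4, 7, 2, 5, 8, 3, 6, 9, with ranks running from 9 down to 1.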

Answer 1 (score: 2)

Here is one approach.

It does need two sorts, though...

library(data.table)

df <- data.table(category = rep(c('c1', 'c2', 'c3'), each = 3),
                 id = 1:9,
                 count = c(10, 10, 10, 9, 8, 8, 7, 6, 4))

setorder(df,category,-count)
df[,r1 := seq_len(.N),by=category]

setorder(df,r1)
df[,rank := rev(seq_len(.N))]

Answer 2 (score: 2)

We can try:

library(data.table)
setDT(df)[order(-count), N:=1:.N, by = category]
df[order(N)][, rank:=.N:1][, N:= NULL][]

Or with dplyr:

library(dplyr)
df %>%
   group_by(category) %>%
   arrange(desc(count)) %>%
   mutate(n = row_number()) %>%
   arrange(n) %>%
   ungroup() %>%
   mutate(rank = rev(row_number()))
#     category    id count     n  rank
#       (fctr) (int) (dbl) (int) (int)
#  1       c1     1    10     1     9
#  2       c1     2    10     2     8
#  3       c1     3    10     3     7
#  4       c2     4     9     1     6
#  5       c2     5     8     2     5
#  6       c2     6     8     3     4
#  7       c3     7     7     1     3
#  8       c3     8     6     2     2
#  9       c3     9     4     3     1
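
The printed result above lists the rows category by category, which is not the interleaved order asked for in the question. A minimal sketch that ungroups before sorting, so that the row order does not depend on how `arrange()` treats grouped data, could look like this. It assumes dplyr is installed; `res` is a name chosen for the example.

```r
library(dplyr)

# Rebuild the example data from the question.
df <- data.frame(category = rep(c('c1', 'c2', 'c3'), each = 3),
                 id = 1:9,
                 count = c(10, 10, 10, 9, 8, 8, 7, 6, 4))

res <- df %>%
  group_by(category) %>%
  # Position within each category, highest count first;
  # ties are broken by order of appearance.
  mutate(n = row_number(desc(count))) %>%
  ungroup() %>%
  # Sort the whole table: all first-place rows, then second-place rows, ...
  arrange(n, category) %>%
  # Number the rows from the total row count down to 1.
  mutate(rank = rev(row_number()))

res
```

For the example data the rows come out in id order 1, 4, 7, 2, 5, 8, 3, 6, 9 with ranks 9 down to 1.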