I have a data frame like this:
library(plyr)
library(dplyr)
ID <- c("ID001","ID002","ID003","ID004","ID005",
"ID006","ID007","ID008","ID009","ID010")
Fail <- c(3,7,2,3,9,7,3,2,3,9)
Pass <- c(0,0,4,26,1,1,3,0,9,9)
df <- data.frame(ID,Fail,Pass)
I added another column to compute the failure percentage:
df$Fail_Percentage <- (df$Fail/(df$Fail+df$Pass))*100
Now I sort this data frame and create a variable "Rank":
library(data.table)
df <- df%>%
arrange(-Fail_Percentage) %>%
mutate(Rank = rleid(Fail_Percentage))
df
I get this output:
ID Fail Pass Fail_Percentage Rank
ID001 3 0 100.00000 1
ID002 7 0 100.00000 1
ID008 2 0 100.00000 1
ID005 9 1 90.00000 2
ID006 7 1 87.50000 3
ID007 3 3 50.00000 4
ID010 9 9 50.00000 4
ID003 2 4 33.33333 5
ID009 3 9 25.00000 6
ID004 3 26 10.34483 7
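The problem is that this approach produces duplicate ranks. I want to break ties by giving priority to "Fail".
For example, ID001, ID002 and ID008 all have rank 1, but I want to order them by the highest Fail count first. So ID002 would get rank 1, ID001 rank 2, and ID008 rank 3. I want to rank the other tied entries in the same way.
The desired output would be: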
ID Fail Pass Fail_Percentage Rank
ID002 7 0 100.00000 1
ID001 3 0 100.00000 2
ID008 2 0 100.00000 3
ID005 9 1 90.00000 4
ID006 7 1 87.50000 5
ID010 9 9 50.00000 6
ID007 3 3 50.00000 7
ID003 2 4 33.33333 8
ID009 3 9 25.00000 9
ID004 3 26 10.34483 10
What is a good way to do this? Can someone point me in the right direction?
Answer 0 (score: 3)
ID <- c("ID001","ID002","ID003","ID004","ID005",
"ID006","ID007","ID008","ID009","ID010")
Fail <- c(3,7,2,3,9,7,3,2,3,9)
Pass <- c(0,0,4,26,1,1,3,0,9,9)
df <- data.frame(ID,Fail,Pass)
df$Fail_Percentage <- (df$Fail/(df$Fail+df$Pass))*100
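Using only data.table:
library(data.table)
# sort by Fail_Percentage, then by Fail (both descending), and number the rows
df <- setDT(df)[order(-Fail_Percentage, -Fail)][, Rank := 1:.N]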
Answer 1 (score: 0)
You can pass a second argument to arrange to fix the order within ties:
library(dplyr)
df = structure(list(ID = structure(1:10, .Label = c("ID001", "ID002",
"ID003", "ID004", "ID005", "ID006", "ID007", "ID008", "ID009",
"ID010"), class = "factor"), Fail = c(3, 7, 2, 3, 9, 7, 3, 2,
3, 9), Pass = c(0, 0, 4, 26, 1, 1, 3, 0, 9, 9)), .Names = c("ID",
"Fail", "Pass"), row.names = c(NA, -10L), class = "data.frame")
df = df %>%
  mutate(Fail_Percentage = Fail / (Fail + Pass) * 100) %>%
  arrange(-Fail_Percentage, -Fail) %>%
  # rows are already sorted, so the rank is simply the row position
  mutate(Rank = row_number())
> df
ID Fail Pass Fail_Percentage Rank
1 ID002 7 0 100.00000 1
2 ID001 3 0 100.00000 2
3 ID008 2 0 100.00000 3
4 ID005 9 1 90.00000 4
5 ID006 7 1 87.50000 5
6 ID010 9 9 50.00000 6
7 ID007 3 3 50.00000 7
8 ID003 2 4 33.33333 8
9 ID009 3 9 25.00000 9
10 ID004 3 26 10.34483 10
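
Both answers rely on the same idea: sort by Fail_Percentage, break ties with Fail, then number the rows. As a rough sketch only, the same thing can be done in base R without any packages, and without reordering the original data frame. The variable names ord and ranked are just illustrative, not taken from either answer.

```r
# Same example data as in the question
df <- data.frame(
  ID   = sprintf("ID%03d", 1:10),
  Fail = c(3, 7, 2, 3, 9, 7, 3, 2, 3, 9),
  Pass = c(0, 0, 4, 26, 1, 1, 3, 0, 9, 9)
)
df$Fail_Percentage <- df$Fail / (df$Fail + df$Pass) * 100

# Row indices sorted by Fail_Percentage (descending),
# with ties broken by Fail (descending)
ord <- order(-df$Fail_Percentage, -df$Fail)

# Assign ranks in place, keeping the original row order of df
df$Rank <- integer(nrow(df))
df$Rank[ord] <- seq_len(nrow(df))

# Optional: a sorted copy, in the order of the desired output
ranked <- df[ord, ]
print(ranked)
```

Note that rows tied on both Fail_Percentage and Fail would still get distinct ranks here, in their original order; if such rows should share a rank, a different ranking rule would be needed.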