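This task looks simple, but after going through several answers posted on Stack Overflow I still cannot get the right result, so I need help. I have been working from this post: How to rank within groups in R?
I have data from experiments in which multiple variables were collected, and I need to rank the performance of certain products under each experimental condition. Below is an example of the expected output in the ranking column.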
id customer location fluid water temperature speed time product response ranking
1 103365333 Acme International Newtown US light fluid 5 105 8 2 AK125 25.94 1
2 103365333 Acme International Newtown US light fluid 5 105 8 2 AK560 25.19 2
3 103365333 Acme International Newtown US light fluid 5 105 8 2 PR600 24.56 3
4 103365333 Acme International Newtown US light fluid 5 105 8 2 PR300 23.69 4
5 103365333 Acme International Newtown US light fluid 5 105 8 2 XY500 23.63 5
6 103365333 Acme International Newtown US light fluid 5 105 8 2 XYZ123 22.75 6
7 103365333 Acme International Newtown US light fluid 5 105 8 2 ABC567 21.50 7
8 103365333 Acme International Newtown US light fluid 5 105 8 2 Z12345 21.50 8
9 103365333 Acme International Newtown US light fluid 5 105 8 2 W21450 21.00 9
10 103365333 Acme International Newtown US light fluid 5 105 8 2 W21010 20.54 10
11 103365333 Acme International Newtown US heavy fluid 5 105 8 2 W20001 19.06 11
12 103365333 Acme International Newtown US heavy fluid 5 105 8 2 W22025 15.88 12
13 155259007 New Great Company Ghosttown CA residue good 10 105 8 2 AK125 13.52 1
14 155259007 New Great Company Ghosttown CA residue good 10 120 4 2 AK560 8.75 1
15 155259007 New Great Company Ghosttown CA residue good 10 120 4 2 PR600 6.00 2
16 155259007 New Great Company Ghosttown CA residue good 10 120 4 2 PR300 1.50 3
17 155259007 New Great Company Ghosttown CA residue good 10 120 4 2 XY500 1.50 4
18 155259007 New Great Company Ghosttown CA residue good 5 105 8 2 XYZ123 14.25 1
19 155259007 New Great Company Ghosttown CA residue good 5 105 8 2 ABC567 13.25 2
20 155259007 New Great Company Ghosttown CA residue good 5 105 8 2 Z12345 12.88 3
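My goal is to rank the products by response in descending order within each experimental condition, as shown in the expected output. Not every product is used in every experiment, which is what makes this tricky.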
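To pin down what ranking within a condition means here, the following is a minimal base R sketch on a subset of the rows above (rows 13 to 20). As a simplification, only water, temperature and speed are used as grouping columns, and the data frame name toy is made up for illustration. Ties are broken by row order, which matches the expected output for the two products with a response of 1.50.

```r
# Subset of the example data (rows 13-20); `toy` is a hypothetical name.
toy <- data.frame(
  id          = 155259007,
  water       = c(10, 10, 10, 10, 10, 5, 5, 5),
  temperature = c(105, 120, 120, 120, 120, 105, 105, 105),
  speed       = c(8, 4, 4, 4, 4, 8, 8, 8),
  product     = c("AK125", "AK560", "PR600", "PR300", "XY500",
                  "XYZ123", "ABC567", "Z12345"),
  response    = c(13.52, 8.75, 6.00, 1.50, 1.50, 14.25, 13.25, 12.88)
)

# Rank the negated response so the largest response gets rank 1;
# ave() applies the ranking separately within each combination of
# the grouping columns and keeps the original row order.
toy$ranking <- ave(
  -toy$response,
  toy$water, toy$temperature, toy$speed,
  FUN = function(x) rank(x, ties.method = "first")
)

toy$ranking
# 1 1 2 3 4 1 2 3
```

With the full data, the remaining condition columns (id, customer, location, fluid, time) would be passed to ave() as additional grouping arguments.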
I am trying my "standard" pipeline:
df %>%
  arrange(id, customer, location, fluid, water, temperature, speed, time, -response) %>%
  group_by(id, customer, location, fluid, water, temperature, speed, time) %>%
  mutate(ranking = dense_rank(response))
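But all I get is an overall ranking rather than a ranking per group. Do you see anything wrong with my code, or is there a limit on the number of variables that can be used in group_by? I have also tried other ranking functions (all of them based on rank, though). Thanks.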
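For comparison, here is a minimal dplyr sketch that does produce a per-group ranking on a subset of the example data (rows 13 to 20, with only water, temperature and speed kept as condition columns; the names toy and ranked are made up for illustration). Two things in it are assumptions rather than facts from this post. First, a ranking over the whole table instead of per group is what you get when a mutate from another package (for example plyr, if it is attached after dplyr) masks dplyr's mutate and ignores the grouping, so the call is written as dplyr::mutate. Second, it uses row_number(desc(response)): dense_rank(response) ranks in ascending order and gives tied values the same rank, whereas the expected output is descending with consecutive ranks for ties.

```r
library(dplyr)

# Subset of the example data (rows 13-20); `toy` is a hypothetical name.
toy <- data.frame(
  id          = 155259007,
  water       = c(10, 10, 10, 10, 10, 5, 5, 5),
  temperature = c(105, 120, 120, 120, 120, 105, 105, 105),
  speed       = c(8, 4, 4, 4, 4, 8, 8, 8),
  product     = c("AK125", "AK560", "PR600", "PR300", "XY500",
                  "XYZ123", "ABC567", "Z12345"),
  response    = c(13.52, 8.75, 6.00, 1.50, 1.50, 14.25, 13.25, 12.88)
)

ranked <- toy %>%
  group_by(id, water, temperature, speed) %>%
  # Explicit namespace guards against a masked mutate (assumption, see above).
  # desc() makes the highest response rank 1; row_number() breaks ties
  # by order of appearance within the group.
  dplyr::mutate(ranking = row_number(desc(response))) %>%
  ungroup()

ranked$ranking
# 1 1 2 3 4 1 2 3
```

Running conflicts() or typing mutate at the console shows which package's mutate is actually being called in a given session.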