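I want to reorder the levels of a factor in one column, but separately within the groups defined by a grouping column.
A simple example dataset: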
df <- structure(list(a_factor = structure(1:6, .Label = c("a", "b",
"c", "d", "e", "f"), class = "factor"), group = structure(c(1L,
1L, 1L, 2L, 2L, 2L), .Label = c("group1", "group2"), class = "factor"),
value = 1:6), class = "data.frame", row.names = c(NA, -6L
))
> df
a_factor group value
1 a group1 1
2 b group1 2
3 c group1 3
4 d group2 4
5 e group2 5
6 f group2 6
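More precisely, how can I reorder the factor levels so that they are, for example, descending by value where df$group == "group1" and ascending by value where df$group == "group2", preferably in dplyr?
The expected output could be: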
> df
a_factor group value
1 c group1 3
2 b group1 2
3 a group1 1
4 d group2 4
5 e group2 5
6 f group2 6
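That said, the question is more generally about how to approach this kind of problem in dplyr.
Answer 0 (score: 2)
We can negate value depending on the group, and then arrange by the result: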
library(dplyr)

# Negate value for group1 so that an ascending sort gives descending order there
df %>%
  arrange(case_when(
    group == "group1" ~ -value,
    group == "group2" ~ value))
# a_factor group value
# 1 c group1 3
# 2 b group1 2
# 3 a group1 1
# 4 d group2 4
# 5 e group2 5
# 6 f group2 6
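Note that arrange() only reorders the rows; levels(df$a_factor) is still a, b, c, d, e, f afterwards. If the factor levels should follow the new row order as well, one possible sketch is to add fct_inorder() from forcats after the sort. Sorting by group first is an assumption added here so that the result does not depend on group1 values being smaller than group2 values; res is just an illustrative name.

```r
library(dplyr)
library(forcats)

# Same data as in the question
df <- data.frame(
  a_factor = factor(c("a", "b", "c", "d", "e", "f")),
  group    = factor(rep(c("group1", "group2"), each = 3)),
  value    = 1:6
)

res <- df %>%
  # sort by group first, then descending in group1 and ascending in group2
  arrange(group, if_else(group == "group1", -value, value)) %>%
  # set the levels to the order in which they now appear
  mutate(a_factor = fct_inorder(a_factor))

levels(res$a_factor)
# "c" "b" "a" "d" "e" "f"
```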
Answer 1 (score: 2)
Here is a base R solution.
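# Split value by group; sort the first group descending, the others ascending
sp <- split(df$value, df$group)
sp <- lapply(seq_along(sp), function(i) sort(sp[[i]], decreasing = i == 1))
# Indexing a_factor by the sorted values works here only because value equals the row number
df$a_factor <- factor(df$a_factor, levels = df$a_factor[unlist(sp)])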
df$a_factor
#[1] a b c d e f
#Levels: c b a d e f
df[order(df$a_factor), ]
# a_factor group value
#3 c group1 3
#2 b group1 2
#1 a group1 1
#4 d group2 4
#5 e group2 5
#6 f group2 6
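The code above indexes a_factor with the sorted values, which only works because value happens to equal the row number in this example. A variant that does not rely on that coincidence could build the row permutation with order() instead. This is a sketch under the same assumptions as the question (group1 descending, group2 ascending); key, ord and sorted_df are illustrative names.

```r
# Same data as in the question
df <- data.frame(
  a_factor = factor(c("a", "b", "c", "d", "e", "f")),
  group    = factor(rep(c("group1", "group2"), each = 3)),
  value    = 1:6
)

# Sort key: negated in group1 (descending), unchanged in group2 (ascending)
key <- ifelse(df$group == "group1", -df$value, df$value)

# Row permutation: by group first, then by the key within each group
ord <- order(df$group, key)

# Use the permuted labels as the new level order
df$a_factor <- factor(df$a_factor, levels = as.character(df$a_factor[ord]))

levels(df$a_factor)
# "c" "b" "a" "d" "e" "f"

sorted_df <- df[order(df$a_factor), ]
```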
Answer 2 (score: 1)
One option is to use group_split and pass a list of logical values that specifies, for each group, how arrange should be applied.
library(tidyverse)
df %>%
group_split(group) %>%
map2_df(., list(FALSE, TRUE), ~ if(.y) .x %>%
arrange(value) else .x %>% arrange(desc(value)))
# A tibble: 6 x 3
# a_factor group value
# <fct> <fct> <int>
#1 c group1 3
#2 b group1 2
#3 a group1 1
#4 d group2 4
#5 e group2 5
#6 f group2 6
Answer 3 (score: 1)
To reorder the factor levels, you can use forcats (part of the tidyverse) and do something like this...
library(dplyr)    # provides %>% and mutate
library(forcats)
df2 <- df %>% mutate(a_factor = fct_reorder(a_factor,
value*(-1 + 2 * (group=="group1"))))
levels(df2$a_factor)
[1] "f" "e" "d" "a" "b" "c"
This does not rearrange the data frame itself...
df2
a_factor group value
1 a group1 1
2 b group1 2
3 c group1 3
4 d group2 4
5 e group2 5
6 f group2 6
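The levels printed above run f, e, d, a, b, c, which is not the c, b, a, d, e, f order the question asked for. A possible adaptation is to flip the sign the other way and then sort the rows by the reordered factor. This sketch assumes all values are positive, so that every negated group1 key sorts before every group2 key; df3 is an illustrative name.

```r
library(dplyr)
library(forcats)

# Same data as in the question
df <- data.frame(
  a_factor = factor(c("a", "b", "c", "d", "e", "f")),
  group    = factor(rep(c("group1", "group2"), each = 3)),
  value    = 1:6
)

df3 <- df %>%
  # negative keys for group1 (descending), positive keys for group2 (ascending)
  mutate(a_factor = fct_reorder(a_factor,
                                if_else(group == "group1", -value, value))) %>%
  # arrange() sorts a factor column by its level order
  arrange(a_factor)

levels(df3$a_factor)
# "c" "b" "a" "d" "e" "f"
```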