Question

I have the following data:

library(dplyr)

d <- tibble(
region = c('West Duns', 'West Alpha', 'East fun', 'East Hull',
         'Jess One', 'Jess Two'),
 figures= c(5, 7, 4, 8, 7, 6))

I would like the data to look like this:

 d <- tibble(
   region = c('Jess One', 'Jess Two', 'West Alpha', 'West Duns',
              'East fun', 'East Hull'),
   figures = c(7, 6, 7, 5, 4, 8))

I know I can use:

 d %>%
   arrange(factor(.$region, levels = c('Jess One', 'Jess Two', 'West Alpha',
                                       'West Duns', 'East fun', 'East Hull'))) -> d2

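But what happens when the data is very long, so that typing out every factor level takes a lot of time? I would like to use case_when and %like% inside the argument. So it would be something like this: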

 d2 %>%
 arrange(factor(.$region, case_when (levels = c(%like% "Jess", %like% 
 "West", %like% "East"))) -> d2

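So the rows are ordered by the first word, in the order given in the statement, and then alphabetically by the second word. I assume the alphabetical ordering will happen naturally, so what I mainly need help with is how to use case_when and %like% here.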

Thanks

Answer 1

Here is one option using stringr::word and forcats::fct_relevel:

library(tidyverse)
# Take the first word of each region, order it as Jess, West, East, then sort by it
d %>% mutate(reg = word(region),
             reg_fac = fct_relevel(reg, 'Jess', 'West', 'East')) %>%
      arrange(reg_fac)

We can make it shorter:

library(tidyverse)
d %>% arrange(fct_relevel(word(region),'Jess','West','East')) 

# A tibble: 6 x 2
   region     figures
    <chr>        <dbl>
1 Jess One         7
2 Jess Two         6
3 West Duns        5
4 West Alpha       7
5 East fun         4
6 East Hull        8

Using base::factor:

d %>% dplyr::arrange(factor(gsub('(.*)\\s.*','\\1',.$region), 
                            levels = c('Jess','West','East')))
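
Since the question asks about case_when specifically, the same idea could be sketched as below. This is only a sketch under some assumptions: base grepl stands in for %like% (which comes from data.table, not dplyr), the fallback rank 4 for unmatched regions is an arbitrary choice, and tolower(region) is added as a second key so that rows within a group come out alphabetically regardless of letter case.

```r
library(dplyr)

d <- tibble(
  region  = c('West Duns', 'West Alpha', 'East fun', 'East Hull',
              'Jess One', 'Jess Two'),
  figures = c(5, 7, 4, 8, 7, 6))

d2 <- d %>%
  arrange(
    # First key: rank each row by the prefix of its region
    case_when(grepl('^Jess', region) ~ 1,
              grepl('^West', region) ~ 2,
              grepl('^East', region) ~ 3,
              TRUE ~ 4),          # assumed fallback: unmatched regions go last
    # Second key: case-insensitive alphabetical order within each group
    tolower(region))
```

With only three prefixes this is no shorter than the fct_relevel version, but each condition can be any logical expression, which is what makes case_when attractive for messier rules.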

Answer 2


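d %>%
  arrange(max.col(outer(region, c('Jess', 'West', 'East'), startsWith)))

# # A tibble: 6 x 2
#   region     figures
#   <chr>        <dbl>
# 1 Jess One         7
# 2 Jess Two         6
# 3 West Duns        5
# 4 West Alpha       7
# 5 East fun         4
# 6 East Hull        8

outer with startsWith and the vector of names gives a matrix whose (i, j) element is TRUE if region i starts with name j. max.col then gives, for each row, the index of the column that is TRUE, i.e. the index of the element of the name vector that the given region starts with.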

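If you want to search the whole string rather than just its start, you can replace startsWith with a function that matches anywhere in the string.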
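
A rough sketch of matching anywhere in the string rather than only at the start. It assumes stringr::str_detect as the matching function, chosen here for illustration because it is vectorised over both the string and the pattern, which is what outer needs. The names keys, hits and d3 are hypothetical, and ties.method = 'first' is set explicitly so that a row matching several names takes the earliest one instead of a random one.

```r
library(dplyr)
library(stringr)

d <- tibble(
  region  = c('West Duns', 'West Alpha', 'East fun', 'East Hull',
              'Jess One', 'Jess Two'),
  figures = c(5, 7, 4, 8, 7, 6))

keys <- c('Jess', 'West', 'East')   # desired group order

# hits[i, j] is TRUE when region i contains keys[j] anywhere in the string
hits <- outer(d$region, keys, str_detect)

# Sort rows by the index of the first matching key
d3 <- d %>% arrange(max.col(hits, ties.method = 'first'))
```

Note that a region matching none of the names gets column index 1 under ties.method = 'first', so it would sort together with the first group; that case does not occur in this data.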
