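How do I create a column that is the result of applying a piecewise function to another column?

Time: 2019-02-06 06:01:29

Tags: r dplyr

Here is a short example that I will use to illustrate my problem: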

library(dplyr)  # re-exports tibble() and provides mutate()

river_tibble <- tibble(
  river_name = c("River 1", "River 2", "River 3", "River 4", "River 5"),
  river_width_1 = c(418, 1264, 744, 3403, 2089)
)

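I have a piecewise function. I want to add a new column to my tibble (using a mutate variant, I assume?) that is the result of a function mapping river_width_1 values to river_width_2 values as follows: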

river_width_1 == 0 -> 0,
0 < river_width_1 < 500 -> 1,
500 <= river_width_1 < 1000 -> 2,
1000 <= river_width_1 < 2000 -> 3,
2000 <= river_width_1 -> 4

So the end result for this example would look like this:

river_tibble <- tibble(
  river_name = c("River 1", "River 2", "River 3", "River 4", "River 5"),
  river_width_1 = c(418, 1264, 744, 3403, 2089),
  river_width_2 = c(1, 3, 2, 4, 4)
)

Is there a way to construct river_width_2 from this set of conditions using dplyr?

1 Answer:

Answer 0 (score: 1)

We can use cut:

library(dplyr)

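# right = FALSE makes each bin [lower, upper), matching the rules in the question.
# Note that a width of exactly 0 still falls in bin 1 here, not 0.
river_tibble %>% 
   mutate(river_width_2 = as.integer(cut(river_width_1, c(-Inf, 500, 1000, 2000, Inf), right = FALSE)))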

# river_name  river_width_1 river_width_2
#  <chr>              <dbl>         <int>
#1 River 1              418             1
#2 River 2             1264             3
#3 River 3              744             2
#4 River 4             3403             4
#5 River 5             2089             4
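
One thing to keep in mind with cut is which end of each interval is closed. By default (right = TRUE) the intervals are closed on the right, whereas the rules in the question are closed on the left, so values that sit exactly on a break land in different bins. The sample data contains no such values, so the difference does not show up there. A minimal base R check on boundary values (the vector x is made up for illustration):

```r
# Boundary values taken from the rules in the question (illustrative only)
x <- c(0, 499, 500, 999, 1000, 1999, 2000)
breaks <- c(-Inf, 500, 1000, 2000, Inf)

# Default right = TRUE: intervals are (lower, upper], so 500 lands in bin 1
right_closed <- as.integer(cut(x, breaks))

# right = FALSE: intervals are [lower, upper), so 500 lands in bin 2
left_closed <- as.integer(cut(x, breaks, right = FALSE))

data.frame(x, right_closed, left_closed)
```

In both variants a width of exactly 0 is assigned to bin 1, so the `river_width_1 == 0 -> 0` rule would still need separate handling.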

Or with case_when:

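# Conditions are checked in order; >= keeps 500 and 1000 in the upper bin
river_tibble %>%
  mutate(river_width_2 = case_when(river_width_1 == 0 ~ 0, 
                                   river_width_1 > 0 & river_width_1 < 500 ~ 1, 
                                   river_width_1 >= 500 & river_width_1 < 1000 ~ 2, 
                                   river_width_1 >= 1000 & river_width_1 < 2000 ~ 3, 
                                   TRUE ~ 4))  # everything else (2000 and above for non-negative widths)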
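
As a possible third option, the same mapping can be sketched in base R with findInterval, which uses left-closed intervals by default and so matches the rules in the question without extra arguments. The helper name width_class is hypothetical, and the sketch assumes widths are non-negative (the rules do not say what negative values should map to):

```r
# Hypothetical helper, not part of dplyr or base R
width_class <- function(w) {
  # findInterval() counts how many breaks are <= w, using [lower, upper) bins:
  # w < 500 -> 0, 500 <= w < 1000 -> 1, 1000 <= w < 2000 -> 2, w >= 2000 -> 3
  cls <- findInterval(w, c(500, 1000, 2000)) + 1L
  # The rules map a width of exactly 0 to class 0
  cls[which(w == 0)] <- 0L
  cls
}

# Sample widths from the question
sample_classes <- width_class(c(418, 1264, 744, 3403, 2089))

# Values sitting exactly on the boundaries
boundary_classes <- width_class(c(0, 500, 1000, 2000))
```

Because the helper is vectorised, it could be used inside the pipeline as mutate(river_width_2 = width_class(river_width_1)).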