How do I create a binary variable based on changes in another column?

Time: 2019-07-11 14:56:21

Tags: r

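I am trying to count, for each ID, the number of times the shore column changes from West to East and vice versa. Here is a subset of my data frame: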

structure(list(ID = c(30767L, 30767L, 30767L, 30767L, 30767L, 
30767L, 30767L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 
30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 
30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 
30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 
30759L, 30759L, 30759L), shore = c("West", "West", "West", "West", 
"West", "West", "West", "West", "West", "West", "West", "West", 
"East", "West", "East", "East", "West", "West", "West", "West", 
"West", "West", "West", "West", "West", "West", "East", "West", 
"West", "West", "West", "West", "East", "East", "East", "East", 
"East", "East", "East", "East")), row.names = c(NA, -40L), groups = structure(list(
    ID = c(30759L, 30767L), .rows = list(8:40, 1:7)), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

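Basically, the first thing I want to do is flag every change: a West-to-East movement as 0 and an East-to-West movement as 1. Rows with no change are marked "-". See the example below.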

      ID Shore Direction
1  30759  West         -
2  30759  West         -
3  30759  West         -
4  30759  East         0
5  30759  West         1
6  30759  East         0
7  30759  East         -
8  30759  West         1
9  30759  West         -
10 30759  West         -

2 Answers:

Answer 0 (score: 1)

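Group by ID, then use case_when together with lag to compute the variable. Grouping matters because lag then returns NA for the first row of each ID instead of reaching back into the previous ID's rows.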

library(dplyr)

DF %>%
  group_by(ID) %>%
  mutate(dir = case_when(
    shore == "West" & lag(shore) == "East" ~ 1L,  # East-to-West change
    shore == "East" & lag(shore) == "West" ~ 0L,  # West-to-East change
    TRUE ~ NA_integer_)) %>%                      # no change, or first row of an ID
  ungroup()
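Since the question's stated goal is a count of changes per ID, the flag can be summarised in a further step. The following is a minimal sketch, assuming a small made-up data frame in place of the asker's data; the column names east_to_west, west_to_east and total are illustrative and do not appear in the original answer.

```r
library(dplyr)

# Hypothetical toy data, not the asker's data frame
DF <- data.frame(
  ID    = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  shore = c("West", "East", "East", "West", "East", "West", "West"),
  stringsAsFactors = FALSE
)

counts <- DF %>%
  group_by(ID) %>%
  mutate(dir = case_when(
    shore == "West" & lag(shore) == "East" ~ 1L,  # East-to-West change
    shore == "East" & lag(shore) == "West" ~ 0L,  # West-to-East change
    TRUE ~ NA_integer_)) %>%
  summarise(
    east_to_west = sum(dir == 1L, na.rm = TRUE),  # number of 1 flags
    west_to_east = sum(dir == 0L, na.rm = TRUE),  # number of 0 flags
    total        = sum(!is.na(dir))               # any change at all
  )

print(counts)
```

In this toy data, ID 1 changes shore twice (once in each direction) and ID 2 changes once (East to West), so total is 2 and 1 respectively.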

Answer 1 (score: 1)

Here is one way to do it with dplyr:

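library(dplyr)

# prev holds the previous row's shore; dplyr::lag is spelled out so that
# stats::lag is never picked up by mistake
df %>% 
  dplyr::mutate(prev = dplyr::lag(shore),
                direction = dplyr::case_when(shore == "West" & prev == "East" ~ 1,
                                             shore == "East" & prev == "West" ~ 0,
                                             TRUE ~ NA_real_))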

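The lag() function returns the previous entry of a vector, here the shore column. I then added a direction column that is 1 when the shore changes from East to West, 0 when it changes from West to East, and NA otherwise. You can drop the prev column afterwards. Note that this relies on df already being grouped by ID, as the data in the question is; on an ungrouped data frame, add group_by(ID) first so that lag() does not compare the first row of one ID with the last row of another.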
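To make the behaviour of lag() and the final clean-up step concrete, here is a short sketch. It assumes a made-up two-ID data frame, and the explicit group_by(ID) and select(-prev) calls are additions for illustration, not part of the answer's code.

```r
library(dplyr)

# dplyr::lag() shifts a vector by one position and pads the front with NA
shore <- c("West", "East", "East", "West")
prev  <- dplyr::lag(shore)
# prev is NA, "West", "East", "East"

# Hypothetical toy data: ID 1 ends on East and ID 2 starts on West
df <- data.frame(
  ID    = c(1L, 1L, 2L, 2L),
  shore = c("West", "East", "West", "West"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%                       # lag() now restarts within each ID
  mutate(prev = lag(shore),
         direction = case_when(shore == "West" & prev == "East" ~ 1,
                               shore == "East" & prev == "West" ~ 0,
                               TRUE ~ NA_real_)) %>%
  ungroup() %>%
  select(-prev)                          # drop the helper column

print(result)
```

Without the group_by(ID) step, the first row of ID 2 would be compared with the last row of ID 1 and wrongly flagged as an East-to-West change.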