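I am trying to count, for each ID, the number of times the shore changes from West to East, and vice versa. Here is a subset of my data frame: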
structure(list(ID = c(30767L, 30767L, 30767L, 30767L, 30767L,
30767L, 30767L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L,
30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L,
30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L,
30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L, 30759L,
30759L, 30759L, 30759L), shore = c("West", "West", "West", "West",
"West", "West", "West", "West", "West", "West", "West", "West",
"East", "West", "East", "East", "West", "West", "West", "West",
"West", "West", "West", "West", "West", "West", "East", "West",
"West", "West", "West", "West", "East", "East", "East", "East",
"East", "East", "East", "East")), row.names = c(NA, -40L), groups = structure(list(
ID = c(30759L, 30767L), .rows = list(8:40, 1:7)), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
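Basically, the first thing I want to do is flag every change: a West-to-East movement as 0 and an East-to-West movement as 1, with - on rows where there is no change. See the example below.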
ID Shore Direction
1 30759 West -
2 30759 West -
3 30759 West -
4 30759 East 0
5 30759 West 1
6 30759 East 0
7 30759 East -
8 30759 West 1
9 30759 West -
10 30759 West -
Answer 0 (score: 1)
Group by ID, then use case_when together with lag to compute the variable.
library(dplyr)

DF %>%
  group_by(ID) %>%
  mutate(dir = case_when(
    shore == "West" & lag(shore) == "East" ~ 1L,  # East -> West
    shore == "East" & lag(shore) == "West" ~ 0L,  # West -> East
    TRUE ~ NA_integer_                            # no change, or first row of an ID
  )) %>%
  ungroup()
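To get from these per-row flags to the per-ID counts the question asks for, one option is to summarise within each ID. This is a minimal sketch, not part of the original answer: the small data frame below is made up for illustration, and the output column names `west_to_east` and `east_to_west` are hypothetical.

```r
library(dplyr)

# Illustrative data only (not the asker's data frame)
DF <- data.frame(
  ID    = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  shore = c("West", "East", "West", "West", "East", "East", "West")
)

counts <- DF %>%
  group_by(ID) %>%
  mutate(dir = case_when(
    shore == "West" & lag(shore) == "East" ~ 1L,  # East -> West
    shore == "East" & lag(shore) == "West" ~ 0L,  # West -> East
    TRUE ~ NA_integer_                            # no change, or first row of an ID
  )) %>%
  summarise(
    west_to_east = sum(dir == 0L, na.rm = TRUE),  # hypothetical column name
    east_to_west = sum(dir == 1L, na.rm = TRUE),  # hypothetical column name
    .groups = "drop"
  )

print(counts)
```

Counting with `sum(..., na.rm = TRUE)` skips the NA rows, so only actual changes of shore are tallied.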
Answer 1 (score: 1)
Here is one way to do it using dplyr:
library(dplyr)  # provides %>%; lag() must be dplyr's, not stats::lag()

df %>%
  dplyr::mutate(prev = dplyr::lag(shore),
                direction = dplyr::case_when(shore == "West" & prev == "East" ~ 1,
                                             shore == "East" & prev == "West" ~ 0,
                                             TRUE ~ NA_real_))
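The lag() function gives the previous entry of the shore column (in this case). I then add a direction column that is 1 when the shore changes from East to West, 0 when it changes from West to East, and NA otherwise. You can then drop the prev column.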
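As a rough sketch of that last step, the helper column can be removed with `select(-prev)`. The data frame here is made up for illustration. The `group_by(ID)` call is an addition that assumes the input is not already grouped; it keeps the first row of one ID from being compared with the last row of the previous ID.

```r
library(dplyr)

# Illustrative data only (not the asker's data frame)
df <- data.frame(
  ID    = c(1L, 1L, 1L, 2L, 2L),
  shore = c("West", "East", "West", "East", "West")
)

result <- df %>%
  group_by(ID) %>%                       # assumption: input is not already grouped
  mutate(prev = lag(shore),
         direction = case_when(shore == "West" & prev == "East" ~ 1,
                               shore == "East" & prev == "West" ~ 0,
                               TRUE ~ NA_real_)) %>%
  ungroup() %>%
  select(-prev)                          # drop the helper column

print(result)
```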