I want to create a new column named age based on an existing column, computed per group using group_by. The dataset is as follows:
tid <- c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
active <- c(0, 1, 0, 4, 0, 0, 0, 1, 0, 0, 1, 0)
person <- c('John', 'John', 'John', 'John', 'Emma', 'Emma', 'Emma', 'Emma', 'Edward', 'Edward', 'Edward', 'Edward')
df <- data.frame(tid, active, person)
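The age should start at 0 when the person is first activated, i.e. the first time active is greater than 0, and then increase by one for each subsequent record. Any suggestions?
I expect the output to look like this: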
name age
John 0
John 0
John 1
John 2
Emma 0
Emma 0
Emma 0
Emma 0
Edward 0
Edward 0
Edward 0
Edward 1
Answer 0 (score: 2)
Does this solve it for you?
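library(dplyr)

df %>%
  group_by(person) %>%
  arrange(person, tid) %>%
  # flag every record that follows the person's first active record;
  # the final 0 is if_else's `missing` value, covering the NA that lag() gives the first row
  mutate(active_dummy = if_else(lag(cumsum(active)) > 0, 1, 0, 0),
         age = cumsum(active_dummy)) %>%
  select(person, age)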
This gives:
# A tibble: 12 x 2
# Groups: person [3]
person age
<chr> <dbl>
1 John 0.
2 John 0.
3 John 1.
4 John 2.
5 Emma 0.
6 Emma 0.
7 Emma 0.
8 Emma 0.
9 Edward 0.
10 Edward 0.
11 Edward 0.
12 Edward 1.
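To see why the lag and cumsum combination produces the requested age, here is a minimal base-R sketch that walks through the intermediate values for John's records only. It is an illustration, not the answer's exact code: the variable names running and previous are made up for this example, and the shift with a leading 0 stands in for dplyr's lag() plus the missing = 0 fallback.

```r
# John's `active` values in tid order (taken from the question's data)
active <- c(0, 1, 0, 4)

# Running total of activations: turns positive at the first active record
running <- cumsum(active)

# Shift by one record so the first active record itself still gets age 0
# (assumption: a leading 0 replaces the NA that dplyr::lag() would produce)
previous <- c(0, head(running, -1))

# 1 for every record after the first activation, 0 otherwise
active_dummy <- as.numeric(previous > 0)

# Age is the number of such records seen so far
age <- cumsum(active_dummy)
print(age)
```

For John this yields 0, 0, 1, 2, matching the expected output in the question.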
Answer 1 (score: 0)
Another approach that should solve the problem:
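library(tidyverse)

# keep only the records from each person's first activation onward, numbered from 0
age_counter <- df %>%
  arrange(tid) %>%
  group_by(person) %>%
  filter(cumsum(active) > 0) %>%
  mutate(age = row_number() - 1)

# join back to the full data; records before the first activation get age 0
df %>%
  left_join(age_counter, by = c("tid", "active", "person")) %>%
  replace_na(list(age = 0)) %>%
  select(person, age)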