Suppose I have the following panel data frame (a reproducible toy example is below):
ID <- c(12232,12232,12232,12232,12232,14452,14452,14452)
Time <- c(1,2,3,4,5,1,2,3)
y1 <- c(2.3,7.8,4.5,3.4,2.3,1.2,0.5,1.9)
State <- c("a","a","a","b","a","c","c","b")
DataFrame <- data.frame(ID, Time, y1, State)  # cbind() would coerce every column to character
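I would like to know whether there is a way to identify the individuals who transition between states (State), together with the observations at which those transitions occur. Desired output: a data frame giving the ID of each individual that changes state, the transition itself, and y1 at the time of the transition, for example something like: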
ID transition y1
12232 a -> b 4.5
12232 b -> a 3.4
14452 c -> b 0.5
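Of course the transition field does not need to have that exact format; ab and bc would work just as well. What matters is that it works by group (ID, since this is panel data) and matches each transition between states with the characteristics observed when it occurs.
Thanks a lot, this site has saved my life many times :)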
Answer 0 (score: 1)
A quick answer using dplyr:
library(dplyr)
DataFrame <- tibble(ID, Time, y1, State)                  # data_frame() is deprecated in favor of tibble()
output <- DataFrame %>%
  group_by(ID) %>%                                        # group the data by ID
  mutate(StateL = lead(State)) %>%                        # lead variable: next state within each ID
  filter(State != StateL) %>%                             # keep rows where the state changes at t+1
  mutate(transState = paste(State, "->", StateL)) %>%     # create the transition label
  select(ID, transState, y1)                              # select the variables to keep
output
## # A tibble: 3 x 3
## # Groups: ID [2]
## ID transState y1
## <dbl> <chr> <dbl>
## 1 12232 a -> b 4.5
## 2 12232 b -> a 3.4
## 3 14452 c -> b 0.5
##
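The grouping step is what keeps one individual's last observation from being compared with the next individual's first. As a minimal sketch of that pitfall, written in base R and independent of the answer's dplyr code (the names `next_ungrouped` and `spurious` are illustrative only), shifting `State` over the whole vector without regard to `ID` flags a change across the boundary between the two individuals:

```r
# Toy data from the question
ID    <- c(12232, 12232, 12232, 12232, 12232, 14452, 14452, 14452)
State <- c("a", "a", "a", "b", "a", "c", "c", "b")

# Lead computed over the whole vector, ignoring ID
next_ungrouped <- c(State[-1], NA)
next_id        <- c(ID[-1], NA)

# Rows where the state "changes" only because the next row belongs to another ID
spurious <- which(State != next_ungrouped & ID != next_id)
print(spurious)
```

On this toy data the flagged row is the last observation of ID 12232, whose "next" state actually belongs to ID 14452. Computing the lead within each ID leaves that row with a missing next state instead, so the filter drops it.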
Answer 1 (score: 0)
Using data.table:
library(data.table)
df <- data.frame(DataFrame)
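setDT(df)                                                     # convert to a data.table by reference
df[, lead := shift(State, type = "lead"), by = ID]            # next state within each ID
df[State != lead, transition := paste0(State, " -> ", lead)]  # label rows where the state changes
df <- df[!is.na(transition), ]                                # drop rows without a transition
df <- df[, c("ID", "transition", "y1")]                       # keep the columns of interest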
Output:
ID transition y1
1: 12232 a -> b 4.5
2: 12232 b -> a 3.4
3: 14452 c -> b 0.5
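For comparison, the same lead-and-compare idea can be sketched in base R without extra packages. This is only an illustration under a few assumptions: the names `panel`, `next_state`, and `transitions` are made up here, and the rows are sorted by `Time` within `ID` before shifting, which the toy data already satisfies.

```r
# Toy panel data from the question
ID    <- c(12232, 12232, 12232, 12232, 12232, 14452, 14452, 14452)
Time  <- c(1, 2, 3, 4, 5, 1, 2, 3)
y1    <- c(2.3, 7.8, 4.5, 3.4, 2.3, 1.2, 0.5, 1.9)
State <- c("a", "a", "a", "b", "a", "c", "c", "b")
panel <- data.frame(ID, Time, y1, State, stringsAsFactors = FALSE)

# Make sure rows are ordered by time within each individual
panel <- panel[order(panel$ID, panel$Time), ]

# Next state within each ID; the last row of each ID gets NA
panel$next_state <- ave(panel$State, panel$ID,
                        FUN = function(s) c(s[-1], NA))

# Keep rows whose state differs from the next one, skipping the NA rows
changed <- !is.na(panel$next_state) & panel$State != panel$next_state
transitions <- data.frame(
  ID         = panel$ID[changed],
  transition = paste(panel$State[changed], "->", panel$next_state[changed]),
  y1         = panel$y1[changed]
)
print(transitions)
```

As in both answers, `y1` is taken from the last observation before the state changes, which matches the desired output in the question.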