Add a column with the value of a variable in the previous group

Time: 2018-03-19 19:24:21

Tags: r dplyr data.table

I have data as shown below, where each id has a status for each time. I want to add a column prev_status that shows the value of status at the previous time.

    set.seed(10); library(dplyr); library(data.table)
    df <- data.table(time = sample(1:3, 20, TRUE),
                     status = sample(letters[1:15], 20, TRUE)
                     )[order(time)
                       ][, id := 1:.N, by = time]

        time status id
     1:    1      j  1
     2:    1      g  2
     3:    1      k  3
     4:    1      m  4
     5:    1      d  5
     6:    1      c  6
     7:    1      m  7
     8:    1      o  8
     9:    2      m  1
    10:    2      l  2
    11:    2      l  3
    12:    2      f  4
    13:    2      i  5
    14:    2      b  6
    15:    2      n  7
    16:    2      g  8
    17:    2      l  9
    18:    2      k 10
    19:    3      f  1
    20:    3      h  2

I can do this for one time value at a time with a join, as below (time = 2).

    df1 <- df %>% filter(time == 1)
    df2 <- df %>% filter(time == 2)

    df2 %>%
      left_join(df1, by = 'id') %>%
      select(-time.y) %>%
      rename(status = status.x, prev_status = status.y, time = time.x)

       time status id prev_status
    1     2      m  1           j
    2     2      l  2           g
    3     2      l  3           k
    4     2      f  4           m
    5     2      i  5           d
    6     2      b  6           c
    7     2      n  7           m
    8     2      g  8           o
    9     2      l  9        <NA>
    10    2      k 10        <NA>

Is there a better way to create prev_status for the whole data.frame? I am open to dplyr or data.table solutions (as well as base R).

1 answer:

Answer 0 (score: 1)

dplyr has a lag() function that makes this easy:

    df %>%
      arrange(time) %>%
      group_by(id) %>%
      mutate(prev_status = lag(status))
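One thing worth checking with the lag() approach: within each id it returns the status from that id's previous row, which is the previous time only if the id appears at every consecutive time. If an id can skip a time step, a self-join on id and time - 1 may be closer to what the question asks for. The following is a minimal sketch under that assumption; the small data frame and the names prev and result are made up for illustration and are not the question's df.

```r
library(dplyr)

# Hypothetical toy data: id 3 first appears at time 2, id 2 is absent at time 3
df <- tibble(
  time   = c(1, 1, 2, 2, 2, 3),
  id     = c(1, 2, 1, 2, 3, 1),
  status = c("a", "b", "c", "d", "e", "f")
)

# Shift every row forward by one time step, so that its status
# lines up with the same id's row at the following time
prev <- df %>%
  mutate(time = time + 1) %>%
  rename(prev_status = status)

# Left join keeps every original row; rows with no match at time - 1 get NA
result <- df %>%
  left_join(prev, by = c("id", "time")) %>%
  arrange(time, id)

print(result)
```

Here prev_status is NA whenever the id has no row at exactly time - 1, whereas lag() would carry over the status from whichever earlier time the id last appeared in.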
