I am using R and have a long data set like the one shown below:
Date ID Status
2014-10-01 12 1
2015-04-01 12 1
2015-07-01 12 1
2015-09-01 12 1
2015-11-01 12 0
2016-01-01 12 0
2016-05-01 12 0
2016-08-01 12 1
2017-03-01 12 1
2017-05-01 12 1
2014-10-01 13 1
2015-04-01 13 1
2015-07-01 13 0
2015-11-01 14 0
2016-01-01 14 0
...
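My goal is to create a "balanced" data set, i.e. every ID should appear at each of the 10 dates. For observations that did not occur in the original data, the variable "Status" should be marked N/A. In other words, the result should look like this: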
Date ID Status
2014-10-01 12 1
2015-04-01 12 1
2015-07-01 12 1
2015-09-01 12 1
2015-11-01 12 0
2016-01-01 12 0
2016-05-01 12 0
2016-08-01 12 1
2017-03-01 12 1
2017-05-01 12 1
2014-10-01 13 1
2015-04-01 13 1
2015-07-01 13 0
2015-09-01 13 N/A
2015-11-01 13 N/A
2016-01-01 13 N/A
2016-05-01 13 N/A
2016-08-01 13 N/A
2017-03-01 13 N/A
2017-05-01 13 N/A
2014-10-01 14 N/A
2015-04-01 14 N/A
2015-07-01 14 N/A
2015-09-01 14 N/A
2015-11-01 14 0
2016-01-01 14 0
2016-05-01 14 N/A
2016-08-01 14 N/A
2017-03-01 14 N/A
2017-05-01 14 N/A
...
Thanks for your help!
Answer 0 (score: 1)
Here is an approach using the tidyverse:
library(tidyverse)
df %>%
group_by(ID) %>%
expand(Date) %>% #in each id expand the dates
left_join(df) -> df1 #join the original data frame and save to object df1
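Or save the result back to the original object (thanks to Renu's comment):
library(magrittr) #the %<>% operator is not attached by library(tidyverse)
df %<>%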
group_by(ID) %>%
expand(Date) %>% #in each id expand the dates
left_join(df)
which is equivalent to:
df %>%
group_by(ID) %>%
expand(Date) %>% #in each id expand the dates
left_join(df) -> df
Result:
ID Date Status
1 12 2014-10-01 1
2 12 2015-04-01 1
3 12 2015-07-01 1
4 12 2015-09-01 1
5 12 2015-11-01 0
6 12 2016-01-01 0
7 12 2016-05-01 0
8 12 2016-08-01 1
9 12 2017-03-01 1
10 12 2017-05-01 1
11 13 2014-10-01 1
12 13 2015-04-01 1
13 13 2015-07-01 0
14 13 2015-09-01 NA
15 13 2015-11-01 NA
16 13 2016-01-01 NA
17 13 2016-05-01 NA
18 13 2016-08-01 NA
19 13 2017-03-01 NA
20 13 2017-05-01 NA
21 14 2014-10-01 NA
22 14 2015-04-01 NA
23 14 2015-07-01 NA
24 14 2015-09-01 NA
25 14 2015-11-01 0
26 14 2016-01-01 0
27 14 2016-05-01 NA
28 14 2016-08-01 NA
29 14 2017-03-01 NA
30 14 2017-05-01 NA
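A caveat on the grouped `expand(Date)` call above: it appears to rely on `Date` being a factor, because tidyr expands a factor to all of its levels rather than only the values present in each group. If `Date` were a character or Date column, each ID would likely keep just its own dates and no rows would be added. Below is a minimal sketch of an alternative that does not depend on this, assuming tidyr is installed. The small data frame and the name `balanced` are made up for illustration and are not from the original post.

```r
library(tidyr)

# Illustrative data (not the poster's): Date is stored as character, not factor.
df <- data.frame(
  Date = c("2014-10-01", "2015-04-01", "2015-07-01", "2014-10-01", "2015-07-01"),
  ID = c(12L, 12L, 12L, 13L, 14L),
  Status = c(1L, 1L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# complete() adds a row for every ID x Date combination that is missing
# from the data and fills the remaining columns (here Status) with NA.
balanced <- complete(df, ID, Date)
```

With 3 IDs and 3 distinct dates, `balanced` has 9 rows, of which the 4 combinations that were never observed carry `NA` in `Status`.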
Data:
> dput(df)
structure(list(Date = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 1L, 2L, 3L, 5L, 6L), .Label = c("2014-10-01", "2015-04-01",
"2015-07-01", "2015-09-01", "2015-11-01", "2016-01-01", "2016-05-01",
"2016-08-01", "2017-03-01", "2017-05-01"), class = "factor"),
ID = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L,
13L, 13L, 13L, 14L, 14L), Status = c(1L, 1L, 1L, 1L, 0L,
0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L)), .Names = c("Date",
"ID", "Status"), class = "data.frame", row.names = c(NA, -15L
))
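For readers who prefer to avoid extra packages, the same idea can be sketched in base R: build the full grid of ID and Date combinations, then left-join the observed rows onto it. This is only an illustration under stated assumptions; the small data frame and the names `grid` and `balanced` are hypothetical and not part of the original answer.

```r
# Illustrative data (not the poster's full data set).
df <- data.frame(
  Date = c("2014-10-01", "2015-04-01", "2015-07-01", "2014-10-01", "2015-07-01"),
  ID = c(12L, 12L, 12L, 13L, 14L),
  Status = c(1L, 1L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Every ID x Date combination that the balanced panel should contain.
grid <- expand.grid(
  ID = unique(df$ID),
  Date = unique(df$Date),
  stringsAsFactors = FALSE
)

# Keep all grid rows (all.x = TRUE); combinations without a match get NA.
balanced <- merge(grid, df, by = c("ID", "Date"), all.x = TRUE)

# Order by ID, then Date, and reset the row names.
balanced <- balanced[order(balanced$ID, balanced$Date), ]
rownames(balanced) <- NULL
```

Here `balanced` has one row per ID and date (9 rows for 3 IDs and 3 dates), and `Status` is `NA` wherever the original data had no observation.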
Answer 1 (score: 0)
The following worked for me:
void main() {
*((volatile unsigned char *)(0x27)) = 128;
volatile unsigned char * x = (unsigned char *) 301;
volatile unsigned char * y = (unsigned char *) 302;
volatile unsigned char * z = (unsigned char *) 303;
start:
*((volatile unsigned char *)(0x28)) = 0;
for (*x = 0; *x < 255; (*x)++) {
for (*y = 0; *y < 255; (*y)++) {
for (*z = 0; *z < 255; (*z)++) {
}
}
}
*((volatile unsigned char *)(0x28)) = 128;
for (*x = 0; *x < 255; (*x)++) {
for (*y = 0; *y < 255; (*y)++) {
for (*z = 0; *z < 255; (*z)++) {
}
}
}
goto start;
}