我的表格类似于:
dt <- data.table(id=c("123","123","234","234","345","345","345","456","456"),
treatment=c("control","control","variable","variable","control","control","control","control","control"),
cohort=c("2015-08-10","2015-08-10","2015-08-10","2015-08-10","2015-08-17","2015-08-17","2015-08-17","2015-08-17","2015-08-17"),
visit_date=c("2015-08-10", "2015-08-11","2015-08-12", "2015-08-18","2015-08-19","2015-08-31","2015-09-01","2015-08-19","2015-08-27"),
visit_week=c("2015-08-10", "2015-08-10","2015-08-10", "2015-08-17","2015-08-17","2015-08-31","2015-08-31","2015-08-17","2015-08-24"))
> dt
id treatment cohort visit_date visit_week
1: 123 control 2015-08-10 2015-08-10 2015-08-10
2: 123 control 2015-08-10 2015-08-11 2015-08-10
3: 234 variable 2015-08-10 2015-08-12 2015-08-10
4: 234 variable 2015-08-10 2015-08-18 2015-08-17
5: 345 control 2015-08-17 2015-08-19 2015-08-17
6: 345 control 2015-08-17 2015-08-31 2015-08-31
7: 345 control 2015-08-17 2015-09-01 2015-08-31
8: 456 control 2015-08-17 2015-08-19 2015-08-17
9: 456 control 2015-08-17 2015-08-27 2015-08-24
我试图输出这样的内容:
cohort treatment visit_week_1 visit_week_2 visit_week_3
1: 2015-08-10 control 1 0 0
2: 2015-08-10 variable 1 1 0
3: 2015-08-17 control 2 1 1
我尝试使用dcast,但我的命令有问题,因为计数已关闭:
> dcast(dt, cohort+treatment ~ paste0("visit_week_", dt[, seq_len(.N), by=id]$V1), value.var="visit_week", function(x) length(unique(x)))
cohort treatment visit_week_1 visit_week_2 visit_week_3
1 2015-08-10 control 1 1 0
2 2015-08-10 variable 1 1 0
3 2015-08-17 control 1 2 1
补充说明:每个访问周需要与每个群组相关。因此,visit_week
1:3对于群组2015-08-10
将是&#34; 2015-08-10&#34;,&#34; 2015-08-17&#34;,&#34; 2015- 08-24&#34 ;.对于队列2015-08-17
周1:3将是&#34; 2015-08-17&#34;,&#34; 2015-08-24&#34;,&#34; 2015-08-31&#34 ;
答案 0 :(得分:4)
您可以使用dplyr
和tidyr
:
library(dplyr)
library(tidyr)
dt %>% group_by(cohort, treatment, visit_week) %>%
summarise(visits = n()) %>%
mutate(week = paste0("visit_week_", as.numeric(as.factor(visit_week)))) %>%
dplyr::select(-visit_week) %>%
spread(week, visits, fill = 0)
Source: local data table [3 x 5]
Groups:
cohort treatment visit_week_1 visit_week_2 visit_week_3
(chr) (chr) (dbl) (dbl) (dbl)
1 2015-08-10 control 2 0 0
2 2015-08-10 variable 1 1 0
3 2015-08-17 control 2 1 2