The data frame df1 summarizes detections of individuals (ID) over time (Date). A short example:
library(lubridate)  # provides ymd()
df1 <- data.frame(ID = c(1,2,1,2,1,2,1,2,1,2),
                  Date = ymd(c("2016-08-21","2016-08-24","2016-08-23","2016-08-29","2016-08-27","2016-09-02","2016-09-01","2016-09-09","2016-09-01","2016-09-10")))
df1
ID Date
1 1 2016-08-21
2 2 2016-08-24
3 1 2016-08-23
4 2 2016-08-29
5 1 2016-08-27
6 2 2016-09-02
7 1 2016-09-01
8 2 2016-09-09
9 1 2016-09-01
10 2 2016-09-10
I want to summarize, for each individual, the number of days since its first detection (Ndays) and the number of distinct days on which it has been detected since that first detection (Ndifdays).
In addition, I want this summary table to include a variable called Prop, which is Ndifdays divided by Ndays.
The summary table I am hoping for looks like this:
> Result
ID Ndays Ndifdays Prop
1 1 11 4 0.364 # Between 21st Aug and 1st Sept there are 11 days.
2 2 17 5 0.294 # Between 24th Aug and 10th Sept there are 17 days.
Does anyone know how to do this?
Answer 0 (score: 1)
You can do this with dplyr:
library(dplyr)
df1 %>%
group_by(ID) %>%
summarise(Ndays = as.integer(max(Date) - min(Date)),
Ndifdays = n_distinct(Date),
Prop = Ndifdays/Ndays)
# ID Ndays Ndifdays Prop
# <dbl> <int> <int> <dbl>
#1 1 11 4 0.364
#2 2 17 5 0.294
The data.table version is:
library(data.table)
df12 <- setDT(df1)[, .(Ndays = as.integer(max(Date) - min(Date)),
Ndifdays = uniqueN(Date)), by = ID]
df12$Prop <- df12$Ndifdays/df12$Ndays
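And with aggregate from base R:
# FUN returns a named vector, so aggregate() stores both values in a matrix column called Date
df12 <- aggregate(Date ~ ID, df1, function(x) c(Ndays = as.integer(max(x) - min(x)), Ndifdays = length(unique(x))))
df12 <- cbind(df12["ID"], df12$Date)  # flatten the matrix column into Ndays and Ndifdays
df12$Prop <- df12$Ndifdays / df12$Ndays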
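Answer 1 (score: 0)
After grouping by "ID", take the diff of the range of "Date" to create "Ndays", then use n_distinct to get the number of unique "Date" values, and divide that distinct count by Ndays to get "Prop":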
library(dplyr)
df1 %>%
group_by(ID) %>%
summarise(Ndays = as.integer(diff(range(Date))),
Ndifdays = n_distinct(Date),
Prop = Ndifdays/Ndays)
# A tibble: 2 x 4
# ID Ndays Ndifdays Prop
# <dbl> <int> <int> <dbl>
#1 1 11 4 0.364
#2 2 17 5 0.294
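As a quick sanity check on the steps described above, the values for ID 1 can be recomputed by hand in base R. This is only a sketch: the variable names dates_id1, ndays, ndifdays and prop are made up for illustration, and the dates are copied from the example data in the question.

```r
# Dates for ID 1, copied from the example data (2016-09-01 appears twice)
dates_id1 <- as.Date(c("2016-08-21", "2016-08-23", "2016-08-27",
                       "2016-09-01", "2016-09-01"))

# Days between the first and the last detection
ndays <- as.integer(diff(range(dates_id1)))

# Number of distinct detection days (the duplicate date counts once)
ndifdays <- length(unique(dates_id1))

# Proportion of days with a detection, rounded as in the printed tibble
prop <- round(ndifdays / ndays, 3)

c(Ndays = ndays, Ndifdays = ndifdays, Prop = prop)
# Ndays = 11, Ndifdays = 4, Prop = 0.364
```

Note that an individual detected on a single day only would have Ndays equal to 0, so Prop would come out as Inf; whether that case needs special handling depends on the data.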