Counting rows per ID per day in a data table in R

Time: 2014-11-26 12:35:31

Tags: r

I have a data table and need to count the number of rows for each ID on each day, but I have not been able to get it working. The code I am using is

Res=foreach(j=1:length(uniquetimestampnumber)) %dopar% 
   { ANALYSIS[,length(TIMESTAMP),by=list(ID)]}

The data table looks like this

    TIMESTAMP    ID
8/5/2014 17:45  28808
8/5/2014 18:00  28808
8/5/2014 18:15  69821
8/5/2014 18:30  69821
8/5/2014 18:45  69821
8/5/2014 19:00  56247
8/5/2014 19:15  56247
8/5/2014 19:30  56247
8/5/2014 19:45  56247
8/5/2014 20:00  56247
8/5/2014 20:00  28808
8/5/2014 20:15  28808
8/5/2014 20:30  28808
8/5/2014 20:45  28808
8/5/2014 21:00  69821
8/5/2014 21:15  69821

1 answer:

Answer 0: (score: 0)

Try

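library(data.table)
# Convert df to a data.table, then count rows (.N) per calendar day and ID.
# The format assumes day/month/year; the time part of TIMESTAMP is ignored.
setDT(df)[, .N, by = list(Day = as.Date(TIMESTAMP, format = '%d/%m/%Y'), ID)]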
#          Day    ID N
#1: 2014-05-08 28808 6
#2: 2014-05-08 69821 5
#3: 2014-05-08 56247 5

Or perhaps

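# Add Day by reference, then key by Day and ID (this sorts the rows) before counting.
setDT(df)[, Day := as.Date(TIMESTAMP, format = '%d/%m/%Y')]
setkey(df, Day, ID)[, .N, by = list(Day, ID)]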
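The keyed variant returns the same counts, but because `setkey` sorts the table, the groups come back ordered by Day and then ID rather than by first appearance. A minimal sketch of that behaviour, rebuilding the question's sample compactly (the names `times`, `dt`, and `keyed_counts` are illustrative and do not appear in the original post):

```r
library(data.table)

# Same timestamps and IDs as the question's sample, rebuilt compactly
times <- c("17:45", "18:00", "18:15", "18:30", "18:45", "19:00", "19:15", "19:30",
           "19:45", "20:00", "20:00", "20:15", "20:30", "20:45", "21:00", "21:15")
dt <- data.table(
  TIMESTAMP = paste("8/5/2014", times),
  ID = rep(c(28808L, 69821L, 56247L, 28808L, 69821L), times = c(2, 3, 5, 4, 2))
)

# Add the Day column by reference (assumes day/month/year order)
dt[, Day := as.Date(TIMESTAMP, format = "%d/%m/%Y")]

# Keying sorts the rows by Day, then ID
setkey(dt, Day, ID)

# Count rows per Day and ID; groups now appear in key order
keyed_counts <- dt[, .N, by = .(Day, ID)]
print(keyed_counts)
```

Here the IDs should come back as 28808, 56247, 69821, whereas the unkeyed version lists them in order of first appearance.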

Data

df <-  structure(list(TIMESTAMP = c("8/5/2014 17:45", "8/5/2014 18:00", 
"8/5/2014 18:15", "8/5/2014 18:30", "8/5/2014 18:45", "8/5/2014 19:00", 
"8/5/2014 19:15", "8/5/2014 19:30", "8/5/2014 19:45", "8/5/2014 20:00", 
"8/5/2014 20:00", "8/5/2014 20:15", "8/5/2014 20:30", "8/5/2014 20:45", 
"8/5/2014 21:00", "8/5/2014 21:15"), ID = c(28808L, 28808L, 69821L, 
69821L, 69821L, 56247L, 56247L, 56247L, 56247L, 56247L, 28808L, 
28808L, 28808L, 28808L, 69821L, 69821L)), .Names = c("TIMESTAMP", 
"ID"), class = "data.frame", row.names = c(NA, -16L))
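The sample above covers a single day, so it does not show the per-day split. The attempt in the question grouped by ID only, which would merge the days together. The following is a rough sketch using hypothetical two-day data (the rows and the names `dt2` and `counts` are invented for illustration, and day/month/year order is assumed as in the answer):

```r
library(data.table)

# Hypothetical rows spanning two days; not taken from the question's data
dt2 <- data.table(
  TIMESTAMP = c("8/5/2014 17:45", "8/5/2014 18:00", "9/5/2014 09:00",
                "9/5/2014 09:15", "9/5/2014 09:30"),
  ID = c(28808L, 28808L, 28808L, 69821L, 69821L)
)

# Grouping by both Day and ID keeps the days separate;
# grouping by ID alone would report 3 rows for 28808
counts <- dt2[, .N, by = .(Day = as.Date(TIMESTAMP, format = "%d/%m/%Y"), ID)]
print(counts)
```

With this input, ID 28808 should get one row per day (2 on the first day, 1 on the second) and ID 69821 a single row with 2.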