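Goal: I want to find the total time each user spent studying across the whole time series.
Problem: Given the data below, I don't know how to get each user's total study time, especially because a new sequence starts whenever the location changes or there is a large gap in time.
Some clarifications: The timestamp and location data are recorded by a phone that the person carries. The phone records the person's time and location only occasionally. As long as the location does not change, one user's data can therefore be treated as a time series.
Technically, the timestamps in the real dataset are Unix timestamps, and the dataset is very large: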
uid <- c(1,1,1,1,2,2,2,3,3,3,2,2,1,3,2,2,2,1)
timestamp <- c(1,4,5,7,3,8,15,1,2,3,300,305,600,150,410,413,415,800)
location <- c("library1","library1","library2","library2","library1","library2","library2",
"library2","library2","library2","library4","library4","library4","library3",
"library2","library1","library1","library1")
df <- data.frame(uid, timestamp, location)  # data.frame keeps timestamp numeric; cbind() would coerce everything to character
# Desired Output
uid.output <- c(1,2,3)
study.duration <- c(5,14,2)
df.output <- cbind(uid.output,study.duration)
Any help would be greatly appreciated!
Answer 0: (score: 0)
You can try something like this:
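library(data.table)
dt <- as.data.table(df)[, timestamp := as.numeric(timestamp)]  # works whether df is a cbind() matrix or a data.frame
# lri numbers each run of consecutive rows at the same location, per user
dt[, lri := rleid(location), by = uid]
# run duration = last timestamp - first timestamp; then sum the runs for each user
dt[, .(d = timestamp[.N] - timestamp[1L]), by = .(uid, lri)][, .(duration = sum(d)), by = uid]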
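The run-length approach above starts a new sequence only when the location changes. The question also asks for a split when the time gap is large, so here is a minimal base R sketch of that variant. It is not part of the original answer. The threshold `max_gap` and its value of 100 are hypothetical and should be replaced with whatever counts as a "large" gap in the real data. The sample data is repeated with the first location written as "library1", on the assumption that the capitalised "Library1" in the question is a typo.

```r
# Sample data from the question (assuming "Library1" was meant to be "library1")
uid <- c(1,1,1,1,2,2,2,3,3,3,2,2,1,3,2,2,2,1)
timestamp <- c(1,4,5,7,3,8,15,1,2,3,300,305,600,150,410,413,415,800)
location <- c("library1","library1","library2","library2","library1","library2","library2",
              "library2","library2","library2","library4","library4","library4","library3",
              "library2","library1","library1","library1")
df <- data.frame(uid, timestamp, location, stringsAsFactors = FALSE)

max_gap <- 100  # hypothetical threshold: a longer gap starts a new session

# Sort so that each user's records are contiguous and in time order
df <- df[order(df$uid, df$timestamp), ]
n <- nrow(df)

# A row opens a new session if the user changes, the location changes,
# or the gap to the previous row exceeds max_gap
new_session <- c(TRUE,
                 df$uid[-1] != df$uid[-n] |
                 df$location[-1] != df$location[-n] |
                 diff(df$timestamp) > max_gap)
df$session <- cumsum(new_session)

# Length of each session (last timestamp minus first), then the total per user
session_len <- aggregate(timestamp ~ uid + session, data = df,
                         FUN = function(t) max(t) - min(t))
result <- aggregate(timestamp ~ uid, data = session_len, FUN = sum)
names(result)[2] <- "study.duration"
print(result)
```

On the sample data this gives 5, 14 and 2 for users 1, 2 and 3, the same as the desired output, because every large gap there coincides with a location change. The two variants would differ only when a user stays at one location across a gap longer than `max_gap`.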