Analyzing timestamp data in SAS or R

Time: 2014-07-11 21:56:03

Tags: r loops sas

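I have a series of timestamps representing a user's activity on a website. I want to split these timestamps into sessions (defined as timestamps less than one hour apart), then compute the length of each session and the gaps between sessions.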

The sample dataset looks like this:

[image of the sample dataset not preserved]

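Is there a way to loop over this series of timestamps in SAS or R so that I can compute the session lengths (for example, the 23:00 session on 01JUL14) and the gaps between sessions (for example, the time between the July 1 and July 9 sessions)?

Thanks!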

1 Answer:

Answer 0: (score: 1)

# reproducible input data
dta <- data.frame(time = as.POSIXct(c("2006-10-21 18:47:22",
                                      "2006-10-21 18:57:58",
                                      "2006-10-21 19:59:05",
                                      "2006-10-21 20:05:05",
                                      "2006-10-21 20:06:05",
                                      "2006-10-21 20:07:05",
                                      "2006-10-21 22:04:05",
                                      "2006-10-21 22:05:05")))
# Flag the timestamps that start or stop a session.
# This should match your definition of a session (less than 1 hr of inactivity).
# Units are set explicitly because diff() on date-times otherwise chooses them automatically.
dta$s.start <- c(TRUE, as.numeric(diff(dta$time), units = "mins") > 60)  # TRUE = start of new session, 60 min as max gap within a session
dta$s.stop  <- c(dta$s.start[-1], TRUE)  # TRUE = stop of this session

# indices of the timestamps that mark a session
sessions <- data.frame(
  s.1 = which(dta$s.start),  # starts
  s.2 = which(dta$s.stop))   # stops

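# durations and gaps, in minutes
(durations <- difftime(dta$time[sessions$s.2], dta$time[sessions$s.1], units = "mins"))
# gap = start of each later session minus stop of the session before it
(gaps <- difftime(dta$time[sessions$s.1[-1]], dta$time[sessions$s.2[-nrow(sessions)]], units = "mins"))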
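A compact variant of the same idea, offered as a sketch rather than as part of the original answer: flag each gap of at least 60 minutes, turn the flags into session ids with cumsum(), and read off the first and last timestamp of each session. It assumes the timestamps are already sorted in ascending order; the UTC time zone and the names times, gap_min, session_id, session_summary and gaps_min are illustrative choices. It uses >= 60 rather than > 60, on the reading that a session contains only timestamps less than one hour apart.

```r
# Same sample timestamps as in the answer; tz = "UTC" is an assumption
times <- as.POSIXct(c("2006-10-21 18:47:22", "2006-10-21 18:57:58",
                      "2006-10-21 19:59:05", "2006-10-21 20:05:05",
                      "2006-10-21 20:06:05", "2006-10-21 20:07:05",
                      "2006-10-21 22:04:05", "2006-10-21 22:05:05"),
                    tz = "UTC")

# Minutes between consecutive timestamps, with explicit units
gap_min <- as.numeric(diff(times), units = "mins")

# A gap of 60 minutes or more starts a new session;
# the running count of session starts is the session id
session_id <- cumsum(c(TRUE, gap_min >= 60))

# First and last timestamp of each session (relies on sorted input)
session_summary <- data.frame(
  session = unique(session_id),
  start   = times[!duplicated(session_id)],
  end     = times[!duplicated(session_id, fromLast = TRUE)]
)

# Session length in minutes
session_summary$duration_min <- as.numeric(
  difftime(session_summary$end, session_summary$start, units = "mins"))

# Gap between the end of one session and the start of the next, in minutes
gaps_min <- as.numeric(
  difftime(session_summary$start[-1],
           session_summary$end[-nrow(session_summary)],
           units = "mins"))

print(session_summary)
print(gaps_min)
```

For the sample data this yields three sessions. Keeping session_id alongside the timestamps also makes it straightforward to compute other per-session statistics later, such as the number of hits per session with table(session_id).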