I am working with timestamp data at 80-second intervals, as shown below:

> head(dataraw)
    GMT_DATE GMT_TIME ACTIVITY_Z
1: 6/19/2018 00:00:00          0
2: 6/19/2018 00:01:20          0
3: 6/19/2018 00:02:40          0
4: 6/19/2018 00:04:00          0
5: 6/19/2018 00:05:20          1
6: 6/19/2018 00:06:40          1

With some help, I managed to write the script below, which should create a two-way table giving the average hourly activity for every hour (0 to 23) of every day present in the dataset, for example:

> head(act.byHour[1:3])
  hour Activity on 6/19/2018 Activity on 6/20/2018
1    0                    88                    59
2    1                    43                    74
3    2                  4297                  4341
4    3                  3708                  3676
5    4                  1728                  2143
6    5                  2528                  3890

Here is the code I am using for this purpose:

> library(lubridate)
> data.byday <- split(dataraw,dataraw$GMT_DATE)
> act.byHour <- Reduce(function(...) merge(..., by = c('hour')), lapply(data.byday,function(df.day)
+ {
+ df.day$hour <- as.numeric(as.difftime(df.day$GMT_TIME,units="mins")) %/% 60
+ act.p.hour <- sapply(split(df.day,df.day$hour),function(df.hour){return(sum(df.hour$ACTIVITY_Z))})
+ hours <- as.integer(c(names(act.p.hour),seq(0,23)[!(0:23 %in% names(act.p.hour))]))
+ act.p.hour <- c(act.p.hour,rep(NA,24-length(act.p.hour)))
+ act.p.hour <- act.p.hour[order(hours)]
+ return(data.frame(hour=hours,activity=act.p.hour))
+ }))
There were 39 warnings (use warnings() to see them)
> names(act.byHour) <- c("hour",paste("Activity on",names(data.byday)))

The code seems to run well, but sadly, for the last hour of every day (23), I get a value of NA.
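I am hoping someone can tell me what is wrong with my code. I expected this to be simple, easy-to-understand code, but I am still learning R, so I find it quite challenging. You can find the full dataset here for reproducibility.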
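For comparison, here is a minimal sketch of the same hour-by-day table built with base R's `tapply` instead of `split`/`merge`. It is not a diagnosis of the NA in hour 23. The data frame `sample_data` and its values are made up for illustration, and the sketch assumes `GMT_TIME` is a character column in `HH:MM:SS` form. It sums `ACTIVITY_Z` per hour, as the script above does.

```r
# Hypothetical sample in the same shape as dataraw (values are invented).
sample_data <- data.frame(
  GMT_DATE   = c("6/19/2018", "6/19/2018", "6/19/2018", "6/20/2018", "6/20/2018"),
  GMT_TIME   = c("00:00:00", "00:05:20", "23:58:40", "00:01:20", "23:00:00"),
  ACTIVITY_Z = c(0, 1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Assumption: the hour is the first two characters of "HH:MM:SS".
# A factor with levels 0:23 keeps every hour as a row, even if it has no data.
hour <- factor(as.integer(substr(sample_data$GMT_TIME, 1, 2)), levels = 0:23)

# Sum ACTIVITY_Z for each hour/date combination.
# Combinations with no observations are returned as NA.
act_by_hour <- tapply(
  sample_data$ACTIVITY_Z,
  list(hour = hour, date = sample_data$GMT_DATE),
  sum
)

print(dim(act_by_hour))                  # 24 rows (hours) by 2 columns (dates)
print(act_by_hour["23", "6/19/2018"])    # total for hour 23 on the first day
```

Because the result is a single matrix indexed by hour and date, there is no per-day merge step and no duplicated `activity` column names to reconcile.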