I have a dataset like this:
> dput(data)
structure(list(Run = c("Dur 2", "Dur 3", "Dur 4", "Dur 5", "Dur 7",
"Dur 8", "Dur 9"), reference = c("00h 00m 32s", "00h 00m 31s",
"00h 05m 46s", "00h 03m 51s", "00h 06m 49s", "00h 06m 47s", "00h 08m 56s"
), test30 = c("00h 00m 44s", "00h 00m 41s", "00h 21m 54s", "00h 13m 37s",
"00h 28m 48s", "00h 22m 54s", "10h 02m 12s"), test31 = c("00h 00m 39s",
"00h 00m 45s", "00h 40m 10s", "00h 23m 07s", "00h 35m 23s", "00h 47m 42s",
"25h 37m 05s"), test32 = c("00h 01m 05s", "00h 01m 13s", "00h 55m 02s",
"00h 28m 54s", "01h 03m 17s", "01h 02m 08s", "39h 04m 39s")), .Names = c("Run",
"reference", "test30", "test31", "test32"), class = "data.frame", row.names = c(NA,
-7L))
I tried to get it into a plottable format as follows:
library(reshape2)
library(scales)
# melt the data and convert the time strings to POSIXct format
data_melted <- melt(data, id.var = "Run")
data_melted$value <- as.POSIXct(data_melted$value, format = "%Hh %Mm %Ss")
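I get NAs for the last duration in the columns where it exceeds 24 hours (test31 and test32), presumably because POSIXct expects actual H/M/S clock data in the 24-hour sense.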
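What is the suggested way to handle data logged like this, where the hours keep counting past 24 (as in Dur 9) instead of rolling over into days?
Do I need to check for such cases manually and build a new string that represents a date (which seems to require picking an arbitrary start day and adding a day whenever H > 24)? Or is there a package better suited to pure duration data, rather than one that assumes all time data is logged as actual timestamps?
Many thanks!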
Answer 0 (score: 2)
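You can use colsplit from the reshape2 package (it is not part of plyr) to create columns for hours, minutes and seconds, and then create a difftime object that can be added to a date.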
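library(reshape2)  # colsplit lives in reshape2, not plyr
# mdd is assumed to be the melted data with value still holding the
# original strings (i.e. before any as.POSIXct conversion)
mdd <- melt(data, id.var = "Run")
# note gsub('s','',mdd[['value']]) removes trailing s from each value
# we then split on `[hm]` (i.e. h or m) -- this returns a data.frame with
# 3 integer columns
times <- colsplit(gsub('s','',mdd[['value']]), '[hm]', names = c('h','m','s'))
# the input is already numeric, so no format argument is needed
seconds <- as.difftime(with(times, h*60*60 + m*60 + s), units = 'secs')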
seconds
Time differences in secs
[1] 32 31 346 231 409 407 536 44 41 1314 817 1728 1374 36132 39 45
[17] 2410 1387 2123 2862 92225 65 73 3302 1734 3797 3728 140679
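As a minimal sketch of the "added to a date" part, the snippet below takes a few of the second counts from the output above and adds them to an origin timestamp. The origin date is an arbitrary assumption chosen only for illustration; any fixed start would do.

```r
# A few durations in seconds, copied from the output above
secs <- as.difftime(c(536, 36132, 92225, 140679), units = "secs")

# Arbitrary origin, assumed only for illustration
origin <- as.POSIXct("2000-01-01 00:00:00", tz = "UTC")

# Adding a difftime to a POSIXct shifts the timestamp, so durations
# longer than 24 hours roll over into the following day
stamps <- origin + secs
format(stamps, "%Y-%m-%d %H:%M:%S")

# For plotting, a plain numeric in a fixed unit is often simpler
hours <- as.numeric(secs, units = "hours")
hours
```

If the plot only needs elapsed time, the numeric hours vector avoids the arbitrary origin altogether.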
Alternatively, you can avoid the manual arithmetic by using Map and Reduce:
Reduce('+',Map(as.difftime, times, units = c('hours','mins','secs')))
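If you would rather not depend on a splitting helper from a package, the same conversion can be sketched in base R. The function name duration_to_secs is hypothetical, and the sketch assumes every string contains exactly three numbers in the order hours, minutes, seconds.

```r
# Hypothetical helper: parse strings such as "25h 37m 05s" into seconds.
# Assumes exactly three numeric fields: hours, minutes, seconds.
duration_to_secs <- function(x) {
  # Extract every run of digits from each string
  parts <- regmatches(x, gregexpr("[0-9]+", x))
  vapply(parts, function(p) {
    p <- as.numeric(p)
    p[1] * 3600 + p[2] * 60 + p[3]
  }, numeric(1))
}

x <- c("00h 08m 56s", "10h 02m 12s", "25h 37m 05s", "39h 04m 39s")
secs <- as.difftime(duration_to_secs(x), units = "secs")
secs
```

Because the hours field is treated as an ordinary number, values above 24 are handled without any special casing.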