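I'm trying to write a ddply summarize statement that works on a vector of POSIXct times. For each user.nm, I just want the maximum and minimum timestamps associated with that name. The data looks like this: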
test.data=structure(list(user.nm = structure(c(1L, 1L, 2L, 3L, 4L, 4L), .Label = c("a",
"b", "c", "d"), class = "factor"), ip.addr.txt = structure(c(1L,
2L, 3L, 4L, 5L, 5L), .Label = c("a", "b", "c", "d", "e"), class = "factor"),
login.dt = structure(c(4L, 3L, 5L, 1L, 2L, 6L), .Label = c("11/20/2013",
"12/26/2013", "3/11/2013", "6/25/2013", "6/27/2013", "7/15/2013"
), class = "factor"), login.time = structure(c(3L, 4L, 6L,
1L, 2L, 5L), .Label = c("10:16:17", "11:07:27", "13:22:32",
"13:55:05", "9:23:33", "9:49:23"), class = "factor"), login.sessn.ts = structure(c(1372180920,
1363024500, 1372340940, 1384960560, 1388074020, 1373894580
), class = c("POSIXct", "POSIXt"), tzone = ""), month = structure(c(3L,
4L, 3L, 5L, 1L, 2L), .Label = c("Dec-2013", "Jul-2013", "Jun-2013",
"Mar-2013", "Nov-2013"), class = "factor"), quarter = c(2L,
1L, 2L, 4L, 4L, 3L), change.label = c(TRUE, TRUE, TRUE, TRUE,
TRUE, TRUE)), .Names = c("user.nm", "ip.addr.txt", "login.dt",
"login.time", "login.sessn.ts", "month", "quarter", "change.label"
), row.names = c(NA, -6L), class = "data.frame")
The plyr statement looks like this:
user.changes=ddply(test.data, c("user.nm"), summarize,
change.count=sum(ip.label.txt),
max.change.time=max(login.sessn.ts),
min.change.time=min(login.sessn.ts))
The error I get is:
Error in attributes(out) <- attributes(col) :
'names' attribute [9] must be the same length as the vector [2]
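I'm having trouble working out what this error actually means. Apparently one person's solution involved converting the POSIXct class to character, which doesn't really work in my case.
Can anyone shed light on how to make this work? I'm also open to other approaches; I just like the relative simplicity of the ddply syntax. I'll be working with more time-based data in the near future, so I'd really appreciate any insight into handling this kind of aggregation problem with other R-based tools.

Answer 0 (score: 0)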
I checked your data with str, and it turns out your dates are actually factors. You can convert them with lubridate:
library(lubridate)
test.data2 <- transform(test.data,lst = dmy_hm(login.sessn.ts))
ddply(test.data2, c("user.nm"), summarize,
change.count=sum(ip.addr.txt),
max.change.time=max(lst),
min.change.time=min(lst))
  user.nm change.count     max.change.time     min.change.time
1       a            3 2013-11-03 13:55:00 2013-01-06 12:03:44
2       b            3 2013-01-06 08:35:32 2013-01-06 08:35:32
3       c            4 2013-01-11 10:16:00 2013-01-11 10:16:00
4       d           10 2046-11-24 13:24:29 2013-01-12 11:08:04
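Since the question also asks about other approaches, here is a minimal base R sketch of the same per-user min/max aggregation that keeps the timestamps as POSIXct. It rests on several assumptions: the data is a cut-down stand-in for test.data, the time zone is set to UTC so the result is reproducible (the original has tzone = ""), change.label is used for the count because ip.label.txt is not a column in the posted data, and summarise_user is a hypothetical helper name. The POSIXlt check is only a guess at the cause of the error: a POSIXlt value is stored as a list of date-time fields rather than a plain vector, which may be what the 'names' attribute message is pointing at, but the posted data does not confirm it.

```r
# Cut-down stand-in for test.data (assumption: only these columns matter here).
test.data <- data.frame(
  user.nm = c("a", "a", "b", "c", "d", "d"),
  change.label = rep(TRUE, 6),
  stringsAsFactors = FALSE
)
# Timestamps copied from the posted dput; tz = "UTC" is an assumption.
test.data$login.sessn.ts <- as.POSIXct(
  c(1372180920, 1363024500, 1372340940, 1384960560, 1388074020, 1373894580),
  origin = "1970-01-01", tz = "UTC"
)

# If the column were POSIXlt (a possible cause of the error), convert it to
# POSIXct, which is a plain numeric vector with a class attribute.
if (inherits(test.data$login.sessn.ts, "POSIXlt")) {
  test.data$login.sessn.ts <- as.POSIXct(test.data$login.sessn.ts)
}

# Hypothetical helper: summarise one user's rows into a one-row data frame.
summarise_user <- function(d) {
  data.frame(
    user.nm = d$user.nm[1],
    change.count = sum(d$change.label),
    max.change.time = max(d$login.sessn.ts),
    min.change.time = min(d$login.sessn.ts),
    stringsAsFactors = FALSE
  )
}

# Split by user, summarise each piece, then bind the rows back together.
pieces <- split(test.data, test.data$user.nm)
user.changes <- do.call(rbind, lapply(pieces, summarise_user))
rownames(user.changes) <- NULL
print(user.changes)
```

Building a one-row data frame per group and binding the rows avoids helpers such as tapply or sapply, which simplify their results to bare numbers and drop the POSIXct class.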