My data is in the following format:
Out Reg.  Task Date   Task State Name  Task State Time
A6ERE     01-03-2014  MANUAL           23:09:48
A6ERE     01-03-2014  MANUAL           23:10:05
A6ERE     01-03-2014  ASSIGN           23:24:44
A6ERE     01-03-2014  CONFRM           23:25:15
A6ERE     01-03-2014  ARR              23:45:07
A6ERE     01-03-2014  STARTED          00:20:00
A6ERE     01-03-2014  FINISH           00:39:00
A6EED     01-03-2014  FREE             22:42:28
A6EED     01-03-2014  MANUAL           23:37:37
A6EED     01-03-2014  ASSIGN           23:37:41
A6EED     01-03-2014  CONFRM           23:37:58
A6EED     01-03-2014  ARR              00:34:26
A6EED     01-03-2014  STARTED          01:04:00
I would like to summarize it as follows:
Out Reg.  Task Date   ARR       ASSIGN    CONFRM    FINISH    FREE      MANUAL    STARTED
A6ERE     01-03-2014  23:45:07  23:24:44  23:25:15  00:39:00            23:10:05  00:20:00
A6EED     01-03-2014  00:34:26  23:37:41  23:37:58  01:41:00  22:42:28  23:37:37  01:04:00
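I tried a pivot table in Excel and dcast in R (Creating a pivot table). Both return the number of timestamps rather than the timestamps themselves.
A crosstab query in MS-Access does produce the desired output shown above.
Is there any way to achieve the same result in R? Please advise.
Thanks.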
Answer 0: (score: 1)
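It looks like you want to keep the last value whenever a state is repeated. When a cell has more than one value, dcast needs an aggregation function and defaults to length if none is given, which is why you got counts instead of timestamps: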
library(reshape2)  # provides dcast

DF <- read.table(text="A6ERE 01-03-2014 MANUAL 23:09:48
A6ERE 01-03-2014 MANUAL 23:10:05
A6ERE 01-03-2014 ASSIGN 23:24:44
A6ERE 01-03-2014 CONFRM 23:25:15
A6ERE 01-03-2014 ARR 23:45:07
A6ERE 01-03-2014 STARTED 00:20:00
A6ERE 01-03-2014 FINISH 00:39:00
A6EED 01-03-2014 FREE 22:42:28
A6EED 01-03-2014 MANUAL 23:37:37
A6EED 01-03-2014 ASSIGN 23:37:41
A6EED 01-03-2014 CONFRM 23:37:58
A6EED 01-03-2014 ARR 00:34:26
A6EED 01-03-2014 STARTED 01:04:00")
# For a repeated state, keep the last timestamp; fill missing cells with "".
dcast(DF, V1 + V2 ~ V3, value.var = "V4",
      fun.aggregate = function(x) tail(as.character(x), 1), fill = "")
#     V1         V2      ARR   ASSIGN   CONFRM   FINISH     FREE   MANUAL  STARTED
#1 A6EED 01-03-2014 00:34:26 23:37:41 23:37:58          22:42:28 23:37:37 01:04:00
#2 A6ERE 01-03-2014 23:45:07 23:24:44 23:25:15 00:39:00          23:10:05 00:20:00
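
If you would rather not depend on reshape2, the same "keep the last duplicate, then spread" idea can probably be done in base R as well. The following is only a sketch under some assumptions: the column names (reg, date, state, time) and the helper objects (keys, last_rows, wide) are made up for illustration and are not from the original post, and missing cells come out as NA rather than "".

```r
# Same sample data as in the question; column names are illustrative only.
DF <- read.table(text = "A6ERE 01-03-2014 MANUAL 23:09:48
A6ERE 01-03-2014 MANUAL 23:10:05
A6ERE 01-03-2014 ASSIGN 23:24:44
A6ERE 01-03-2014 CONFRM 23:25:15
A6ERE 01-03-2014 ARR 23:45:07
A6ERE 01-03-2014 STARTED 00:20:00
A6ERE 01-03-2014 FINISH 00:39:00
A6EED 01-03-2014 FREE 22:42:28
A6EED 01-03-2014 MANUAL 23:37:37
A6EED 01-03-2014 ASSIGN 23:37:41
A6EED 01-03-2014 CONFRM 23:37:58
A6EED 01-03-2014 ARR 00:34:26
A6EED 01-03-2014 STARTED 01:04:00",
  col.names = c("reg", "date", "state", "time"),
  stringsAsFactors = FALSE)

# Keep only the last row of each (reg, date, state) combination.
keys <- DF[c("reg", "date", "state")]
last_rows <- DF[!duplicated(keys, fromLast = TRUE), ]

# Spread the states into columns; combinations that never occur become NA.
wide <- reshape(last_rows, idvar = c("reg", "date"),
                timevar = "state", direction = "wide")

# reshape() prefixes the new columns with "time."; strip that prefix.
names(wide) <- sub("^time\\.", "", names(wide))
wide
```

Dropping the duplicates first means reshape() never sees more than one value per cell, so no aggregation function is needed. The state columns appear in order of first occurrence rather than alphabetically.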