Below is the str of my dataset.
'data.frame': 9995 obs. of 10 variables:
$ Count : int 1 2 3 4 5 6 7 8 9 10 ...
$ Gates : Factor w/ 5 levels "B6","B9","I1",..: 3 3 4 4 3 4 4 4 4 4 ...
$ Entry_Date : Date, format: "0006-10-20" "0006-10-20" "0006-10-20" ...
$ Entry_Time : Factor w/ 950 levels "00:01:00","00:04:00",..: 347 366 450 550 563 700 701 350 460 506 ...
$ Exit_Date : Date, format: "0006-10-20" "0006-10-20" "0006-10-20" ...
$ Exit_Time : Factor w/ 1012 levels "00:00:00","00:01:00",..: 618 556 637 694 770 936 948 590 640 655 ...
$ Type_of_entry : Factor w/ 3 levels "Manual","Pass",..: 3 3 3 3 3 3 3 3 3 3 ...
$ weekday : Factor w/ 7 levels "Friday","Monday",..: 2 2 2 2 2 2 2 6 6 6 ...
$ Ticket.Loss: Factor w/ 2 levels "N","Y": 1 1 1 1 1 2 2 1 1 1 ...
$ Duration : Factor w/ 501 levels "00:01:00","00:02:00",..: 223 142 139 96 159 188 199 192 132 101 ...
I am using the following function:
W <- aggregate(Duration ~ Gates, data=parking, FUN=mean)
But I get the following error:
Warning message: 1: In mean.default(X[[i]], ...) : argument is not numeric or logical: returning NA
Answer 0 (score: 2)
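Duration is a factor of strings that look like durations, "00:01:00" and so on. mean() cannot average a factor, which is why it returns NA. The chron package handles such strings.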
library(chron)
aggregate(chron(times=Duration) ~ Gates, data=parking, FUN=mean)
This gives the mean time for each level of Gates.
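As a cross-check that does not depend on chron, the same per-gate averages can be computed by converting each "HH:MM:SS" string to seconds by hand. This is only a minimal sketch: the toy data frame and the helper to_seconds are hypothetical and are not part of the original question or answer.

```r
# Hypothetical toy data; the values are made up for illustration
toy <- data.frame(
  Gates    = factor(c("I1", "I1", "B6", "B6")),
  Duration = factor(c("00:10:00", "00:20:00", "00:05:00", "00:06:00"))
)

# Hypothetical helper: "HH:MM:SS" (character or factor) -> seconds
to_seconds <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

toy$Seconds <- to_seconds(toy$Duration)

# Mean duration in seconds for each gate
avg <- aggregate(Seconds ~ Gates, data = toy, FUN = mean)

# Round to whole seconds, then format back to HH:MM:SS
s <- as.integer(round(avg$Seconds))
avg$Duration <- sprintf("%02d:%02d:%02d", s %/% 3600L, (s %% 3600L) %/% 60L, s %% 60L)
avg
```

Going through as.character() first matters: calling as.numeric() directly on a factor would return the internal level codes, not the times.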
Answer 1 (score: 0)
If the OP's dataset has a real time column, we can use as.POSIXct to convert it to the 'DateTime' class:
parking$Duration <- as.POSIXct(parking$Duration, format = "%H:%M:%S")
transform(aggregate(Duration ~ Gates, data = parking, FUN = mean),
Duration = sub("\\S+\\s+", "", Duration))
# Gates Duration
#1 B6 11:08:34
#2 B9 11:07:31
#3 I1 11:07:10
Note: no external packages are used. Sample data used above:
set.seed(24)
parking <- data.frame(Gates = sample(c("B6", "B9", "I1"), 20, replace=TRUE),
Duration = format(seq(Sys.time(), length.out=20, by = "1 min") , "%H:%M:%S"))
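The POSIXct approach can also be written with format() instead of a regular expression to drop the date part. The sketch below is one possible variant, not the answer's own code: the toy data is hypothetical, and fixing the time zone to UTC is an assumption made so that daylight-saving changes cannot shift the parsed times.

```r
# Hypothetical toy data; the values are made up for illustration
toy <- data.frame(
  Gates    = c("B6", "B6", "I1", "I1"),
  Duration = c("00:05:00", "00:06:00", "00:10:00", "00:20:00")
)

# Parse the times in a fixed time zone (UTC is an assumption)
toy$Duration <- as.POSIXct(toy$Duration, format = "%H:%M:%S", tz = "UTC")

# Mean date-time for each gate
res <- aggregate(Duration ~ Gates, data = toy, FUN = mean)

# Keep only the time-of-day part of each mean
res$Duration <- format(res$Duration, "%H:%M:%S", tz = "UTC")
res
```

Because as.POSIXct attaches the current date to every value, this only makes sense for durations shorter than 24 hours.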