I need to calculate the median of one column over the first 2 minutes and the last 2 minutes of each group.
Here is some sample data:
Time <- c("2015-08-21T10:00:51", "2015-08-21T10:02:51", "2015-08-21T10:04:51", "2015-08-21T10:06:51",
"2015-08-21T10:08:51", "2015-08-21T10:10:51","2015-08-21T10:12:51", "2015-08-21T10:14:51",
"2015-08-21T10:16:51", "2015-08-21T10:18:51", "2015-08-21T10:20:51", "2015-08-21T10:22:51")
x <- c(38.855, 38.664, 40.386, 40.386, 40.195, 40.386, 40.386, 40.195, 40.386, 38.855, 38.664, 40.386)
y <- c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b")
data <- data.frame(Time,x,y)
data$Time <- as.POSIXct(data$Time, format = "%Y-%m-%dT%H:%M:%S")
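So in this case I need the median of column x for the first 2 minutes of level a ("2015-08-21T10:00:51", "2015-08-21T10:02:51", so x = 38.855, 38.664 and median = 38.7595) and for its last 2 minutes ("2015-08-21T10:08:51", "2015-08-21T10:10:51", so x = 40.195, 40.386 and median = 40.2905), and likewise for level b: its first 2 minutes ("2015-08-21T10:12:51", "2015-08-21T10:14:51", so x = 40.386, 40.195 and median = 40.2905) and its last 2 minutes ("2015-08-21T10:20:51", "2015-08-21T10:22:51", so x = 38.664, 40.386 and median = 39.525)...
The result of this calculation should preferably be returned as a new data.frame:
y median1 median2
a 38.7595 40.2905
b 40.2905 39.525
The grouping has to be based on the Time column, not on a row count (in the original data, the number of rows differs between groups).
Thanks for all ideas and help!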
Answer 0 (score: 1)
One approach (if I understood you correctly):
# Sort by time, then take the median of the first and last two x values per level of y
as.data.frame(as.list(
  aggregate(x ~ y, data[order(data$Time), ], function(x)
    c(med1 = median(head(x, 2)), med2 = median(tail(x, 2)))
  )
))
# y x.med1 x.med2
# 1 a 38.7595 40.2905
# 2 b 40.2905 39.5250
I don't see why the grouping would have to be done on data$Time. Here it is done on data$y. If the dataset is already sorted by time, replace data[order(data$Time), ] with data.
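If the two-minute windows do need to be taken from the timestamps rather than from a fixed number of rows, a minimal base R sketch could look like the following. The helper name window_medians, the inclusive 120-second window and the tz = "UTC" setting are assumptions made for illustration; none of them come from the question.

```r
# Sample data from the question
Time <- c("2015-08-21T10:00:51", "2015-08-21T10:02:51", "2015-08-21T10:04:51", "2015-08-21T10:06:51",
          "2015-08-21T10:08:51", "2015-08-21T10:10:51", "2015-08-21T10:12:51", "2015-08-21T10:14:51",
          "2015-08-21T10:16:51", "2015-08-21T10:18:51", "2015-08-21T10:20:51", "2015-08-21T10:22:51")
x <- c(38.855, 38.664, 40.386, 40.386, 40.195, 40.386, 40.386, 40.195, 40.386, 38.855, 38.664, 40.386)
y <- c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b")
data <- data.frame(Time, x, y)
# tz = "UTC" is an assumption; it only keeps the parsing independent of the local time zone
data$Time <- as.POSIXct(data$Time, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

window_secs <- 120  # assumed window length: 2 minutes, inclusive at both ends

# Hypothetical helper: medians of x within the first and last `secs` seconds of one group
window_medians <- function(d, secs) {
  first <- d$Time <= min(d$Time) + secs  # rows within `secs` of the group's first timestamp
  last  <- d$Time >= max(d$Time) - secs  # rows within `secs` of the group's last timestamp
  data.frame(y = d$y[1],
             median1 = median(d$x[first]),
             median2 = median(d$x[last]))
}

# Split by y, summarise each piece, and bind the pieces back into one data.frame
result <- do.call(rbind, lapply(split(data, data$y), window_medians, secs = window_secs))
rownames(result) <- NULL
result
```

On the sample data each window happens to hold exactly two rows, so the result should match the table in the question. With irregular timestamps the windows may hold more or fewer rows, which is where this differs from head(x, 2) and tail(x, 2).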
For multiple variables, try
library(dplyr)  # requires dplyr >= 1.0 (.add, across)
data %>%
  arrange(Time) %>%
  group_by(y) %>%
  select(-Time) %>%
  # keep the first two and last two rows of each group
  filter(row_number() %in% c(1, 2, n() - 1, n())) %>%
  mutate(f = factor(rep(c("head", "tail"), each = 2))) %>%
  group_by(f, .add = TRUE) %>%
  summarise(across(everything(), median))
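The pipeline above still picks rows by position. As a rough sketch of a time-based variant for several numeric columns, the windows can be computed from Time inside across(). The window_secs name, the inclusive 120-second window and tz = "UTC" are assumptions for illustration, and the code assumes dplyr 1.0 or later.

```r
library(dplyr)

# Sample data from the question
Time <- c("2015-08-21T10:00:51", "2015-08-21T10:02:51", "2015-08-21T10:04:51", "2015-08-21T10:06:51",
          "2015-08-21T10:08:51", "2015-08-21T10:10:51", "2015-08-21T10:12:51", "2015-08-21T10:14:51",
          "2015-08-21T10:16:51", "2015-08-21T10:18:51", "2015-08-21T10:20:51", "2015-08-21T10:22:51")
x <- c(38.855, 38.664, 40.386, 40.386, 40.195, 40.386, 40.386, 40.195, 40.386, 38.855, 38.664, 40.386)
y <- c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b")
data <- data.frame(Time, x, y)
# tz = "UTC" is an assumption; it only keeps the parsing independent of the local time zone
data$Time <- as.POSIXct(data$Time, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

window_secs <- 120  # assumed window length: 2 minutes, inclusive at both ends

res <- data %>%
  group_by(y) %>%
  summarise(
    across(
      where(is.numeric),  # every numeric column; Time (POSIXct) and y are skipped
      list(
        # median over rows within window_secs of the group's first timestamp
        median1 = ~ median(.x[Time <= min(Time) + window_secs]),
        # median over rows within window_secs of the group's last timestamp
        median2 = ~ median(.x[Time >= max(Time) - window_secs])
      )
    ),
    .groups = "drop"
  )
res
```

The output columns are named after the pattern column_function, so the sample data should yield x_median1 and x_median2, one row per level of y.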