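I have 'mydata_hourly' with 3 stations (in reality many more) and their hourly temperature values over one year. That gives me 8760 hourly measurements per station. Now I would like the same structure, but with the 365 daily (24 h) means, i.e. 'mydata_daily'.
I tried some for loops, but without success. I have heard about the aggregate function, but the examples I found rely on a timestamp, which unfortunately I don't have.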
my_data_hourly <- structure(c(8.29, 7.96, 8.14, 7.27, 7.37, 7.3, 7.23, 7.53,
7.98, 10.2, 12.39, 14.34, 14.87, 14.39, 12.54, 11.84, 10.3, 10.62,
10.65, 10.56, 10.43, 10.35, 9.85, 9.12, 8.95, 8.82, 8.92, 9.33,
9.44, 9.3, 9.15, 9.37, 9.54, 10.24, 12.13, 12.43, 12.65, 13,
13.18, 13.58, 13.64, 13.75, 13.85, 13.94, 13.79, 13.84, 13.94,
14.26, 24.93, 24.64, 23.67, 21.46, 21.33, 20.83, 21.12, 21.1,
23.75, 25.39, 30.72, 30.71, 30.81, 30.92, 32.61, 32.37, 32.49,
30.68, 30.23, 30.45, 28.1, 26.9, 25.09, 25.07, 24.59, 24.22,
23.05, 22.21, 22.07, 21.6, 21.24, 21.22, 21.85, 24.87, 28.85,
29.42, 30.82, 30.97, 31.32, 30.81, 30.83, 29.9, 30.01, 30.31,
30, 27.91, 25.78, 25.88, 8.78, 8.47, 8.49, 7.65, 8.63, 9.02,
9.02, 8.11, 7.63, 9.19, 11.25, 12.24, 13.62, 12.09, 10.6, 11.1,
10.16, 10.44, 9.58, 10.04, 10.01, 10.23, 9.51, 9.2, 9.34, 9.6,
9.4, 9.45, 9.36, 9.26, 9.3, 9.46, 9.58, 9.89, 10.6, 11.04, 12.1,
12.61, 13.12, 13.47, 13.55, 13.51, 13.63, 13.84, 13.93, 14.17,
13.97, 13.86), .Dim = c(48L, 3L), .Dimnames = list(NULL, c("station1",
"station2", "station3")))
hourly_measure Station1 Station2 Station3
[1,] 8.29 24.93 8.78
[2,] 7.96 24.64 8.47
[3,] 8.14 23.67 8.49
[4,] 7.27 21.46 7.65
[5,] 7.37 21.33 8.63
[6,] 7.30 20.83 9.02
[7,] 7.23 21.12 9.02
[8,] 7.53 21.10 8.11
[9,] 7.98 23.75 7.63
[10,] 10.20 25.39 9.19
[11,] 12.39 30.72 11.25
[12,] 14.34 30.71 12.24
[13,] 14.87 30.81 13.62
[14,] 14.39 30.92 12.09
[15,] 12.54 32.61 10.60
[16,] 11.84 32.37 11.10
[17,] 10.30 32.49 10.16
[18,] 10.62 30.68 10.44
[19,] 10.65 30.23 9.58
[20,] 10.56 30.45 10.04
[21,] 10.43 28.10 10.01
[22,] 10.35 26.90 10.23
[23,] 9.85 25.09 9.51
[24,] 9.12 25.07 9.20
[25,] 8.95 24.59 9.34
[26,] 8.82 24.22 9.60
[27,] 8.92 23.05 9.40
[28,] 9.33 22.21 9.45
[29,] 9.44 22.07 9.36
[30,] 9.30 21.60 9.26
[31,] 9.15 21.24 9.30
[32,] 9.37 21.22 9.46
[33,] 9.54 21.85 9.58
[34,] 10.24 24.87 9.89
[35,] 12.13 28.85 10.60
[36,] 12.43 29.42 11.04
[37,] 12.65 30.82 12.10
[38,] 13.00 30.97 12.61
[39,] 13.18 31.32 13.12
[40,] 13.58 30.81 13.47
[41,] 13.64 30.83 13.55
[42,] 13.75 29.90 13.51
[43,] 13.85 30.01 13.63
[44,] 13.94 30.31 13.84
[45,] 13.79 30.00 13.93
[46,] 13.84 27.91 14.17
[47,] 13.94 25.78 13.97
[48,] 14.26 25.88 13.86
So in theory I would like my_data_daily[1, 1] to hold the mean of mydata_hourly[1:24, 1], and my_data_daily[2, 1] the mean of mydata_hourly[25:48, 1].
Answer 0 (score: 1)
One dplyr possibility could be:
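library(dplyr)
# df is the hourly data as a data frame; its first column (hourly_measure) is dropped by -1
df %>%
 group_by(Period = gl(n()/24, 24)) %>% # one group per block of 24 rows
 summarise_at(-1, mean)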
Period Station1 Station2 Station3
<fct> <dbl> <dbl> <dbl>
1 1 10.1 26.9 9.79
2 2 11.7 26.7 11.6
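Note that summarise_at() is superseded as of dplyr 1.0.0; the same grouping idea can be written with across(). A minimal self-contained sketch, assuming dplyr >= 1.0.0 and a row count that is an exact multiple of 24. The data frame below is made up for illustration and is not the question's data:

```r
library(dplyr)

# Made-up stand-in for the hourly data: 48 rows (2 days) for 2 stations.
df <- data.frame(hourly_measure = 1:48,
                 Station1 = 1:48,
                 Station2 = rep(c(10, 20), each = 24))

daily <- df %>%
  group_by(Period = gl(n() / 24, 24)) %>%    # rows 1-24 -> level 1, rows 25-48 -> level 2
  summarise(across(-hourly_measure, mean))   # mean of every station column per period

daily
```

With this toy input, Station1 averages to 12.5 and 36.5 and Station2 to 10 and 20, so it is easy to confirm by eye that the 24-row blocks line up with the days.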
Answer 1 (score: 1)
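These are time series, so it is probably best to use a time series representation, which will also help with plotting and other time series processing.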
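1) ts  Assume your data is the matrix m shown reproducibly in the Note at the end. Convert it to a ts time series with frequency 24 and then aggregate it as shown. No packages are used.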
tt <- ts(m, frequency = 24)
aggregate(tt, 1, mean)
giving:
Time Series:
Start = 1
End = 2
Frequency = 1
Station1 Station2 Station3
1 10.06333 26.89042 9.794167
2 11.71000 26.65542 11.585000
2) zooreg  An alternative is to create a zooreg object using the zoo package.
library(zoo)
z <- zooreg(m, frequency = 24)
aggregate(z, as.integer, mean)
giving:
Station1 Station2 Station3
1 10.06333 26.89042 9.794167
2 11.71000 26.65542 11.585000
Note
Lines <- "
Station1 Station2 Station3
8.29 24.93 8.78
7.96 24.64 8.47
8.14 23.67 8.49
7.27 21.46 7.65
7.37 21.33 8.63
7.30 20.83 9.02
7.23 21.12 9.02
7.53 21.10 8.11
7.98 23.75 7.63
10.20 25.39 9.19
12.39 30.72 11.25
14.34 30.71 12.24
14.87 30.81 13.62
14.39 30.92 12.09
12.54 32.61 10.60
11.84 32.37 11.10
10.30 32.49 10.16
10.62 30.68 10.44
10.65 30.23 9.58
10.56 30.45 10.04
10.43 28.10 10.01
10.35 26.90 10.23
9.85 25.09 9.51
9.12 25.07 9.20
8.95 24.59 9.34
8.82 24.22 9.60
8.92 23.05 9.40
9.33 22.21 9.45
9.44 22.07 9.36
9.30 21.60 9.26
9.15 21.24 9.30
9.37 21.22 9.46
9.54 21.85 9.58
10.24 24.87 9.89
12.13 28.85 10.60
12.43 29.42 11.04
12.65 30.82 12.10
13.00 30.97 12.61
13.18 31.32 13.12
13.58 30.81 13.47
13.64 30.83 13.55
13.75 29.90 13.51
13.85 30.01 13.63
13.94 30.31 13.84
13.79 30.00 13.93
13.84 27.91 14.17
13.94 25.78 13.97
14.26 25.88 13.86"
m <- as.matrix(read.table(text = Lines, header = TRUE))
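As a cross-check on the ts approach, the daily means can also be computed in base R by reshaping the matrix into an hour x day x station array and averaging over the hour dimension. This is only a sketch under the assumption that the rows are complete days in chronological order; the names hourly, n_days and daily, and the toy values, are made up for illustration:

```r
# Made-up hourly matrix: 3 days x 24 hours for 2 stations (not the question's data).
hourly <- cbind(Station1 = rep(c(5, 10, 15), each = 24),
                Station2 = 1:72)

n_days <- nrow(hourly) / 24
stopifnot(n_days == round(n_days))   # only complete days are supported

# Reshape to hour x day x station, then average over the first (hour) dimension.
daily <- colMeans(array(hourly, dim = c(24, n_days, ncol(hourly))))
colnames(daily) <- colnames(hourly)

daily

# The ts route from the answer should give the same numbers.
daily_ts <- aggregate(ts(hourly, frequency = 24), nfrequency = 1, FUN = mean)
stopifnot(isTRUE(all.equal(as.numeric(daily), as.numeric(daily_ts))))
```

Here daily is a 3 x 2 matrix with Station1 equal to 5, 10, 15 and Station2 equal to 12.5, 36.5, 60.5. For a full year of 8760 rows the same code would return a 365-row matrix, one row per day.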