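Here is my problem. I have one year of meteorological data, observed hourly, and I want to split it into days. I don't want to resample it; I want to split it every 24 rows, so that I end up with a new dataset whose length is the number of days. Here is an example with two days of data: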
Year Mounth Day date temp humidity wind Hour
1 2017 1 1 01/01/17 -2 74.0 7.2 0
2 2017 1 1 01/01/17 -2 74.0 7.2 1
3 2017 1 1 01/01/17 -2 74.0 7.2 2
4 2017 1 1 01/01/17 -3 73.8 4.6 3
5 2017 1 1 01/01/17 -4 79.6 3.6 4
6 2017 1 1 01/01/17 -4 79.6 4.6 5
7 2017 1 1 01/01/17 -5 85.8 4.1 6
8 2017 1 1 01/01/17 -6 85.7 2.6 7
9 2017 1 1 01/01/17 -4 79.6 3.6 8
10 2017 1 1 01/01/17 -2 68.6 3.6 9
11 2017 1 1 01/01/17 -1 68.8 7.2 10
12 2017 1 1 01/01/17 0 63.9 7.2 11
13 2017 1 1 01/01/17 1 64.2 6.2 12
14 2017 1 1 01/01/17 2 59.7 4.6 13
15 2017 1 1 01/01/17 1 64.2 4.6 14
16 2017 1 1 01/01/17 -1 68.8 3.6 15
17 2017 1 1 01/01/17 -2 68.6 3.6 16
18 2017 1 1 01/01/17 -1 74.2 4.1 17
19 2017 1 1 01/01/17 -2 79.9 5.7 18
20 2017 1 1 01/01/17 -3 86.0 5.7 19
21 2017 1 1 01/01/17 -2 79.9 7.2 20
22 2017 1 1 01/01/17 -3 86.0 8.2 21
23 2017 1 1 01/01/17 -2 79.9 7.2 22
24 2017 1 1 01/01/17 -1 80.0 7.2 23
25 2017 1 2 02/01/17 -1 86.3 7.2 0
26 2017 1 2 02/01/17 -1 86.3 7.7 1
27 2017 1 2 02/01/17 -1 86.3 7.2 2
28 2017 1 2 02/01/17 -1 86.3 7.7 3
29 2017 1 2 02/01/17 -1 92.9 7.7 4
30 2017 1 2 02/01/17 -1 92.9 7.2 5
31 2017 1 2 02/01/17 -1 92.9 7.7 6
32 2017 1 2 02/01/17 -1 100.0 7.2 7
33 2017 1 2 02/01/17 -1 100.0 8.2 8
34 2017 1 2 02/01/17 -1 100.0 7.7 9
35 2017 1 2 02/01/17 0 93.0 6.7 10
36 2017 1 2 02/01/17 0 100.0 5.7 11
37 2017 1 2 02/01/17 1 93.0 6.2 12
38 2017 1 2 02/01/17 1 93.0 4.1 13
39 2017 1 2 02/01/17 1 93.0 4.1 14
40 2017 1 2 02/01/17 1 93.0 5.1 15
41 2017 1 2 02/01/17 1 86.5 5.1 16
42 2017 1 2 02/01/17 1 86.5 4.6 17
43 2017 1 2 02/01/17 -1 92.9 3.6 18
44 2017 1 2 02/01/17 -3 100.0 3.6 19
45 2017 1 2 02/01/17 -2 92.8 4.1 20
46 2017 1 2 02/01/17 -2 92.8 7.7 21
47 2017 1 2 02/01/17 -2 92.8 6.7 22
48 2017 1 2 02/01/17 -1 92.9 7.2 23
Here is my code, which doesn't work:
datasplit <- lapply(split(nrow(data), ceiling((1:nrow(data))/24)), function(i) data[i, ])
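The result I expect is a new dataset with dimensions (2, 24, 8):
2 is the number of days (in this example)
24 is the number of hours
8 is the number of columns
Thanks in advance...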
Answer 0 (score: 0)
The lapply family of functions may be overkill here. I think all you need is split. Just make sure the variable you are splitting on is of type factor.
str(data)
'data.frame': 48 obs. of 8 variables:
$ Year : int 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
$ Mounth : int 1 1 1 1 1 1 1 1 1 1 ...
$ Day : int 1 1 1 1 1 1 1 1 1 1 ...
$ date : Factor w/ 2 levels "01/01/17","02/01/17": 1 1 1 1 1 1 1 1 1 1 ...
$ temp : int -2 -2 -2 -3 -4 -4 -5 -6 -4 -2 ...
$ humidity: num 74 74 74 73.8 79.6 79.6 85.8 85.7 79.6 68.6 ...
$ wind : num 7.2 7.2 7.2 4.6 3.6 4.6 4.1 2.6 3.6 3.6 ...
$ Hour : int 0 1 2 3 4 5 6 7 8 9 ...
With my default import, Day comes in as type int (an integer), not a factor. If that is the case for you too, the following should work:
data$Day <- as.factor(data$Day)
datasplit <- split(data, data$Day)
which outputs the following:
str(datasplit)
List of 2
$ 1:'data.frame': 24 obs. of 8 variables:
..$ Year : int [1:24] 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
..$ Mounth : int [1:24] 1 1 1 1 1 1 1 1 1 1 ...
..$ Day : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
..$ date : Factor w/ 2 levels "01/01/17","02/01/17": 1 1 1 1 1 1 1 1 1 1 ...
..$ temp : int [1:24] -2 -2 -2 -3 -4 -4 -5 -6 -4 -2 ...
..$ humidity: num [1:24] 74 74 74 73.8 79.6 79.6 85.8 85.7 79.6 68.6 ...
..$ wind : num [1:24] 7.2 7.2 7.2 4.6 3.6 4.6 4.1 2.6 3.6 3.6 ...
..$ Hour : int [1:24] 0 1 2 3 4 5 6 7 8 9 ...
$ 2:'data.frame': 24 obs. of 8 variables:
..$ Year : int [1:24] 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
..$ Mounth : int [1:24] 1 1 1 1 1 1 1 1 1 1 ...
..$ Day : Factor w/ 2 levels "1","2": 2 2 2 2 2 2 2 2 2 2 ...
..$ date : Factor w/ 2 levels "01/01/17","02/01/17": 2 2 2 2 2 2 2 2 2 2 ...
..$ temp : int [1:24] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
..$ humidity: num [1:24] 86.3 86.3 86.3 86.3 92.9 92.9 92.9 100 100 100 ...
..$ wind : num [1:24] 7.2 7.7 7.2 7.7 7.7 7.2 7.7 7.2 8.2 7.7 ...
..$ Hour : int [1:24] 0 1 2 3 4 5 6 7 8 9 ...
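A few variations that may be worth considering, shown as a minimal sketch rather than a definitive solution. With a full year of data, Day on its own repeats every month, so splitting on it would probably merge rows from different months; the date column, or fixed blocks of 24 rows as in the question, avoids that. The data frame below is a made-up stand-in with the same columns as the question's data, and the names by_date, by_block, block_id, num_cols and arr are only illustrative. The last step assumes that the (days, hours, columns) shape asked for is wanted as an actual array, which can only hold one type, so the date column is left out there.

```r
# Toy stand-in for the question's data: two days of hourly rows.
# Values are made up; only the layout matters.
data <- data.frame(
  Year     = 2017L,
  Mounth   = 1L,
  Day      = rep(1:2, each = 24),
  date     = rep(c("01/01/17", "02/01/17"), each = 24),
  temp     = rep(c(-2L, -1L), each = 24),
  humidity = rep(c(74.0, 86.3), each = 24),
  wind     = 7.2,
  Hour     = rep(0:23, times = 2)
)

# 1) Split on the date column. split() coerces the grouping vector with
#    as.factor(), so an explicit conversion is optional.
by_date <- split(data, data$date)

# 2) Fixed blocks of 24 rows, as attempted in the question. The vector of
#    row indices seq_len(nrow(data)) defines the blocks, not the single
#    number nrow(data).
block_id <- ceiling(seq_len(nrow(data)) / 24)
by_block <- split(data, block_id)

# 3) A days x hours x columns array built from the numeric columns only.
num_cols <- c("Year", "Mounth", "Day", "temp", "humidity", "wind", "Hour")
arr <- simplify2array(lapply(by_block, function(d) as.matrix(d[num_cols])))
arr <- aperm(arr, c(3, 1, 2))  # reorder from (hours, columns, days)
dim(arr)
```

Note that split(data, data$date) orders the list by factor level, and dd/mm/yy strings sort alphabetically rather than chronologically, so for a whole year the row-block version may be the safer choice if the rows are already in time order and no hours are missing.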