I have several decades of twice-daily data with the following structure:
str(Raw.Data)
'data.frame': 709400 obs. of 7 variables:
$ V1: int 254 1 2 3 9 4 4 4 4 4 ...
$ V2: Factor w/ 448 levels "0","100","1000",..: 1 40 11 448 286 4 24 23 20 17 ...
$ V3: Factor w/ 18039 levels "","-1","-10",..: 99 15749 6714 18039 13326 4244 4221 12375 14708 16000 ...
$ V4: Factor w/ 3509 levels "","-1","-10",..: 3503 3034 3496 1 2176 3496 1219 2878 33 149 ...
$ V5: Factor w/ 1295 levels "","-1","-10",..: 1092 1273 1019 1 992 1295 1254 40 187 192 ...
$ V6: int NA 353 99999 NA 230 99999 163 202 238 262 ...
$ V7: int NA 99999 0 NA 40 99999 50 40 70 60 ...
In a spreadsheet-like layout, the first day of data looks like this:
254 0 1 JUN 1957 NA NA
1 94823 72520 40.50N 80.22W 353 99999
2 2000 2000 99999 13 99999 0
3 PIT ms NA NA
9 9780 353 234 105 230 40
4 10000 157 99999 99999 99999 99999
4 8500 1566 143 64 163 50
4 7000 3168 34 -133 202 40
4 5000 5815 -127 -266 238 70
4 4000 7483 -231 -270 262 60
4 3000 9517 -414 99999 258 150
4 2500 10726 -530 99999 260 170
4 2000 12128 -638 99999 271 230
254 12 1 JUN 1957 NA NA
1 94823 72520 40.50N 80.22W 353 99999
2 1000 1500 1690 15 7 0
3 PIT ms NA NA
9 9770 353 168 113 135 40
4 10000 153 99999 99999 99999 99999
4 8500 1537 119 89 216 80
4 7000 3133 16 4 221 70
4 5000 5779 -132 -182 249 90
4 4000 7444 -240 -314 262 90
4 3000 9469 -414 99999 272 120
4 2500 10682 -511 99999 289 130
4 2000 12097 -608 99999 291 150
4 1500 13868 -630 99999 291 160
4 1000 16400 -611 99999 298 110
I would like to reorganize the data so that the first day is reduced to:
0 1 JUN 1957 9780 353 234 105 230 40
12 1 JUN 1957 9770 353 168 113 135 40
To do this, I need cells 2:5 from the rows that begin with "254" and cells 2:7 from the rows that begin with "9".
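The selection described above can also be written without a loop. The following is only a minimal sketch on a made-up miniature of the data: the data frame `raw`, the output name `processed`, and the column names `hour`, `day`, `month`, `year`, `sfc1` to `sfc6` are all hypothetical, and it assumes the columns were read as character rather than factor and that every "254" row is paired with exactly one "9" row.

```r
# Hypothetical miniature of the raw table: two soundings, text columns kept
# as character (not factor). Values are copied from the sample above.
raw <- data.frame(
  V1 = c(254, 1, 9, 4, 254, 1, 9, 4),
  V2 = c("0", "94823", "9780", "10000", "12", "94823", "9770", "10000"),
  V3 = c("1", "72520", "353", "157", "1", "72520", "353", "153"),
  V4 = c("JUN", "40.50N", "234", "99999", "JUN", "40.50N", "168", "99999"),
  V5 = c("1957", "80.22W", "105", "99999", "1957", "80.22W", "113", "99999"),
  V6 = c(NA, 353, 230, 99999, NA, 353, 135, 99999),
  V7 = c(NA, 99999, 40, 99999, NA, 99999, 40, 99999),
  stringsAsFactors = FALSE
)

# Logical indexing picks all matching rows at once.
header  <- raw[raw$V1 == 254, 2:5]  # cells 2:5 of the "254" rows
surface <- raw[raw$V1 == 9,   2:7]  # cells 2:7 of the "9" rows

# Assumption: one "9" row per "254" row, in the same order.
stopifnot(nrow(header) == nrow(surface))

# Drop the original row names so the two pieces line up cleanly.
rownames(header)  <- NULL
rownames(surface) <- NULL

processed <- cbind(header, surface)
# Hypothetical column names; the real meaning of each field is not stated here.
names(processed) <- c("hour", "day", "month", "year", paste0("sfc", 1:6))
print(processed)
```

If some soundings lack a "9" row, the `stopifnot()` check fails and the rows would have to be matched up by sounding instead of by position.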
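I wrote the following code, but it does not even get past the first if statement in the first iteration of the for loop. Perhaps this is a data type or indexing problem?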
leng <- dim(Raw.Data)[1]
Processed.Data <- as.data.frame(matrix(0,ncol = 10, nrow = 42000))
i <- 1:leng
count <- 1
for (i in 1:leng){
if(Raw.for.R[i,1]==254){
Surface.Obs[count,1:4]<-Raw.for.R[i,2:5]
} else if(Raw.or.R$V1[i,1]==9){
Surface.Obs[count,5:10]<-Raw.for.R[i,2:7]
}
count <- count +1
}
When I run the code, I get the following warning messages:
1: In if (Raw.Data[i, 1] == 254) { :
the condition has length > 1 and only the first element will be used
2: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 1 has 709400 rows to replace 1 rows
3: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 2 has 709400 rows to replace 1 rows
4: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 3 has 709400 rows to replace 1 rows
5: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 4 has 709400 rows to replace 1 rows
6: In `[<-.factor`(`*tmp*`, iseq, value = 99L) :
invalid factor level, NA generated
7: In `[<-.factor`(`*tmp*`, iseq, value = 3503L) :
invalid factor level, NA generated
8: In `[<-.factor`(`*tmp*`, iseq, value = 1092L) :
invalid factor level, NA generated
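For comparison, here is a loop of the same shape that runs cleanly on a small stand-in table. It is a sketch under assumptions, not a diagnosis of the exact failure: `raw` and `out` are hypothetical names, one object name is used throughout, factor columns are converted to character before copying, and the counter advances only after a "9" row has been stored. Two general facts apply: the "condition has length > 1" warning means the value tested by `if` was a vector rather than a single value, and "invalid factor level" appears when a value that is not among a factor's levels is assigned into it.

```r
# Hypothetical stand-in with factor columns, mimicking the str() output above.
raw <- data.frame(
  V1 = c(254L, 1L, 9L, 4L, 254L, 1L, 9L, 4L),
  V2 = c("0", "94823", "9780", "10000", "12", "94823", "9770", "10000"),
  V3 = c("1", "72520", "353", "157", "1", "72520", "353", "153"),
  V4 = c("JUN", "40.50N", "234", "99999", "JUN", "40.50N", "168", "99999"),
  V5 = c("1957", "80.22W", "105", "99999", "1957", "80.22W", "113", "99999"),
  V6 = c(NA, 353L, 230L, 99999L, NA, 353L, 135L, 99999L),
  V7 = c(NA, 99999L, 40L, 99999L, NA, 99999L, 40L, 99999L),
  stringsAsFactors = TRUE
)

# Convert factors to character so the text is copied, not the level codes.
is_fac <- vapply(raw, is.factor, logical(1))
raw[is_fac] <- lapply(raw[is_fac], as.character)

# One output row per "9" row; everything is stored as character for now.
n_out <- sum(raw$V1 == 9)
out <- as.data.frame(matrix(NA_character_, nrow = n_out, ncol = 10),
                     stringsAsFactors = FALSE)

count <- 1
for (i in seq_len(nrow(raw))) {      # i is a single row number here
  if (raw$V1[i] == 254) {
    out[count, 1:4] <- unlist(raw[i, 2:5])
  } else if (raw$V1[i] == 9) {
    out[count, 5:10] <- unlist(raw[i, 2:7])
    count <- count + 1               # advance only once a sounding is complete
  }
}
print(out)
```

On 709400 rows a loop like this is likely to be slow, so logical indexing on `V1` is usually the more practical route.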
Any help with even one of my many problems would be greatly appreciated!
P.S. If you have any ideas on how to insert blank rows for missing dates, that could save me an additional question later.
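One common way to get blank rows for missing observation times is to build the full sequence of expected times and left-join the data onto it. This is a rough sketch only: the table `processed` and its column names are hypothetical, the value 9765 is made up for illustration, and it assumes the hours are 0 and 12 in UTC.

```r
# Hypothetical reduced table: both soundings on 1 June, only 12Z on 2 June.
processed <- data.frame(
  hour  = c(0, 12, 12),
  day   = c(1, 1, 2),
  month = c("JUN", "JUN", "JUN"),
  year  = c(1957, 1957, 1957),
  sfc1  = c(9780, 9770, 9765),   # 9765 is an invented value
  stringsAsFactors = FALSE
)

# Turn "JUN" into 6 via the built-in month.abb, independent of the locale.
mon <- match(processed$month, toupper(month.abb))
processed$time <- ISOdatetime(processed$year, mon, processed$day,
                              processed$hour, 0, 0, tz = "UTC")

# Every expected observation time, 12 hours apart.
all_times <- data.frame(
  time = seq(min(processed$time), max(processed$time), by = "12 hours")
)

# Left join: times without a sounding get NA in every data column.
filled <- merge(all_times, processed, by = "time", all.x = TRUE)
print(filled)
```

The missing 00Z sounding on 2 June shows up as a row of `NA` values next to its timestamp.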
Thanks!
Evan