Identifying and filling data gaps in a time series with NA rows

Asked: 2016-02-10 10:29:05

Tags: r loops time-series missing-data

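My data frame d contains only HrMn, Temp, and dates. I created Time, which gives the expected time of each observation at 30-minute intervals, and Gdates, which gives the expected date of each observation. The check column tests whether the recorded and expected date and time match, using: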

# Expected HrMn labels for one day at 30-minute intervals
R <- c("0","30","100","130","200","230","300","330","400","430","500","530","600","630","700","730","800","830",
       "900","930","1000","1030","1100","1130","1200","1230","1300","1330","1400","1430","1500","1530","1600","1630",
       "1700","1730","1800","1830","1900","1930","2000","2030","2100","2130","2200","2230","2300","2330")
d$Time <- rep(R, len = NROW(d))                          # recycle the daily labels over all rows
start <- as.POSIXlt("2014-03-01")
interval <- 30                                           # minutes between observations
end <- start + as.difftime(31, units = "days")
test <- seq(from = start, by = interval * 60, to = end)  # expected timestamps
d$Gdates <- rep(substring(test, 1, 10), len = NROW(d))   # date part of each timestamp
d$check <- d$HrMn == d$Time & d$Gdates == d$dates        # TRUE where time and date both match

The data frame looks like this:

    HrMn    Temp    Time    Gdates      dates       check
 100    11      0       07-03-14    07-03-14    FALSE
 130    11      30      07-03-14    07-03-14    FALSE
 230    12      100     07-03-14    07-03-14    FALSE
 300    14      130     07-03-14    07-03-14    FALSE
 330    15      200     07-03-14    07-03-14    FALSE
 430    17      230     07-03-14    07-03-14    FALSE
 500    19      300     07-03-14    07-03-14    FALSE
 530    20      330     07-03-14    07-03-14    FALSE
 600    21      400     07-03-14    07-03-14    FALSE
 700    22      430     07-03-14    07-03-14    FALSE
 730    23      500     07-03-14    07-03-14    FALSE
 800    23      530     07-03-14    07-03-14    FALSE
 830    24      600     07-03-14    07-03-14    FALSE
 900    25      630     07-03-14    07-03-14    FALSE
 930    25      700     07-03-14    07-03-14    FALSE
1000    25      730     07-03-14    07-03-14    FALSE
1030    25      800     07-03-14    07-03-14    FALSE
1100    25      830     07-03-14    07-03-14    FALSE
1130    25      900     07-03-14    07-03-14    FALSE
1200    24      930     07-03-14    07-03-14    FALSE
1230    23      1000    07-03-14    07-03-14    FALSE
1300    22      1030    07-03-14    07-03-14    FALSE
1330    21      1100    07-03-14    07-03-14    FALSE
1400    20      1130    07-03-14    07-03-14    FALSE
1500    18      1200    07-03-14    07-03-14    FALSE
1530    18      1230    07-03-14    07-03-14    FALSE
1600    18      1300    07-03-14    07-03-14    FALSE
1630    18      1330    07-03-14    07-03-14    FALSE
1700    17      1400    07-03-14    07-03-14    FALSE
1730    17      1430    07-03-14    07-03-14    FALSE
1800    17      1500    07-03-14    07-03-14    FALSE
1830    17      1530    07-03-14    07-03-14    FALSE
1900    17      1600    07-03-14    07-03-14    FALSE
1930    16      1630    07-03-14    07-03-14    FALSE
2000    16      1700    07-03-14    07-03-14    FALSE
2030    15      1730    07-03-14    07-03-14    FALSE
2100    15      1800    07-03-14    07-03-14    FALSE
2130    15      1830    07-03-14    07-03-14    FALSE
2200    14      1900    07-03-14    07-03-14    FALSE
2300    13      1930    07-03-14    07-03-14    FALSE
 100    12      2000    07-03-14    08-03-14    FALSE
 130    12      2030    07-03-14    08-03-14    FALSE
 330    17      2100    07-03-14    08-03-14    FALSE
 430    21      2130    07-03-14    08-03-14    FALSE
 530    23      2200    07-03-14    08-03-14    FALSE
 600    24      2230    07-03-14    08-03-14    FALSE
 630    24      2300    07-03-14    08-03-14    FALSE
 700    25      2330    07-03-14    08-03-14    FALSE

I used the following while loop to find the data gaps and insert an NA row at each one.

while (d$check=="FALSE"){
  checkd<-which(d$check == "FALSE")
  d<-d[ c( 0:(checkd[1]-1), NA, checkd[1]:NROW(d) ), ] #adding a row for missing value
  d$Time<-rep(R,len=NROW(d))
  start <- as.POSIXlt("2014-03-01")
  interval <- 30
  end <- start + as.difftime(31, units="days")
  test<-seq(from=start, by=interval*60, to=end)
  d$Gdates<-rep(substring(test,1,10),len=NROW(d))
  d$check<-d$HrMn==d$Time & d$Gdates==d$dates
}

But it adds only one row and then stops with the following error and warnings:

Error in while (d$check == "FALSE") { : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In while (d$check == "FALSE") { :
  the condition has length > 1 and only the first element will be used
2: In while (d$check == "FALSE") { :
  the condition has length > 1 and only the first element will be used
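
The messages are consistent with how `while` evaluates its condition: it needs a single TRUE or FALSE, whereas `d$check == "FALSE"` has one element per row. After the first NA row is inserted at the top, the first element of `d$check` becomes NA, which would explain the "missing value" error. (From R 4.2.0 on, a condition of length greater than one is itself an error rather than a warning.) A minimal sketch of collapsing such a vector to one value, using a made-up `check` vector rather than the real data:

```r
# Toy check vector: one inserted NA row, two mismatches, one match
check <- c(NA, FALSE, FALSE, TRUE)

# A while/if condition must be a single non-NA logical value.
# any() collapses the vector; na.rm = TRUE ignores the NA row.
has_gap <- any(!check, na.rm = TRUE)

# Position of the first mismatch; which() drops NA entries on its own
first_gap <- which(!check)[1]

print(has_gap)
print(first_gap)
```

With a condition of this shape, the loop would also need a guarantee that every pass removes a mismatch, otherwise it may never terminate.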

How can I fix this?
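
One loop-free possibility is to build the complete expected grid first and left-join the observations onto it, so that each missing slot appears as a row with NA. The sketch below uses base R `merge` on a small hypothetical data frame; the sample values, the `expected`, `stamp`, and `filled` names, the UTC time zone, and the dd-mm-yy date layout are assumptions for illustration, not the real data.

```r
# Hypothetical sample: the 0030 and 0200 observations are missing
d <- data.frame(
  HrMn  = c(0L, 100L, 130L, 230L),
  Temp  = c(11, 11, 12, 14),
  dates = "07-03-14",
  stringsAsFactors = FALSE
)

# Full half-hourly grid over the same span (UTC assumed, so no DST jumps)
grid <- seq(as.POSIXct("2014-03-07 00:00", tz = "UTC"),
            as.POSIXct("2014-03-07 02:30", tz = "UTC"),
            by = "30 min")
expected <- data.frame(
  dates = format(grid, "%d-%m-%y"),
  HrMn  = as.integer(format(grid, "%H")) * 100L + as.integer(format(grid, "%M")),
  stamp = grid,
  stringsAsFactors = FALSE
)

# Left join: every expected slot is kept; slots without an observation get Temp = NA
filled <- merge(expected, d, by = c("dates", "HrMn"), all.x = TRUE)
filled <- filled[order(filled$stamp), ]  # merge() sorts by key, so restore time order

print(filled[, c("dates", "HrMn", "Temp")])
```

Because the gaps are found by matching on date and time rather than by row position, a single missing observation does not shift every later row out of alignment.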

0 answers:

No answers yet