Improving the performance of datetime comparisons in R

Time: 2018-04-23 12:19:44

Tags: r

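Background: For my thesis I have several hundred large CSV files. These files contain time series of weather parameters covering 23,800 half-hourly periods between 1 November 2016 and 1 March 2012. In addition, each period has 20 or 21 runtimes (the number is not fixed, which is basically the problem in my case). The runtime marks the time at which the forecast for a given forecast period was computed. Forecasts are therefore mostly computed before the forecast time (which naturally makes sense). For whatever reason, however, this is not always the case. For some periods (mostly, but not always, between 9:00 a.m. and 12:00 a.m.) there is one runtime per period that lies in the future. I don't want this "future" runtime (I can't understand why it is included).

An example excerpt of the data: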

+-----------------------+-----------------------+-------------+
|    ForecastPeriod     |        Runtime        |    Value    |
+-----------------------+-----------------------+-------------+
| …                     | …                     | …           |
| 02.11.2016 11:30+0000 | 31.10.2016 00:00+0000 | 5.544368776 |
| 02.11.2016 11:30+0000 | 31.10.2016 12:00+0000 | 4.71684533  |
| 02.11.2016 11:30+0000 | 01.11.2016 00:00+0000 | 5.374274986 |
| 02.11.2016 11:30+0000 | 01.11.2016 12:00+0000 | 5.892114875 |
| 02.11.2016 11:30+0000 | 02.11.2016 00:00+0000 | 6.18387462  | <-i want this row
| 02.11.2016 11:30+0000 | 02.11.2016 12:00+0000 | 5.852306909 | <- don't make sense
| 02.11.2016 12:00+0000 | 23.10.2016 12:00+0000 | 14.81608444 |
| 02.11.2016 12:00+0000 | 24.10.2016 00:00+0000 | 3.637574565 |
| …                     | …                     | ...         |
| 02.11.2016 12:00+0000 | 01.11.2016 12:00+0000 | 5.541325144 |
| 02.11.2016 12:00+0000 | 02.11.2016 00:00+0000 | 5.745831136 | <- i want this row
| 02.11.2016 12:00+0000 | 02.11.2016 12:00+0000 | 5.347949883 | <- don't make sense
| 02.11.2016 12:30+0000 | 24.10.2016 00:00+0000 | 3.80366064  |
| 02.11.2016 12:30+0000 | 24.10.2016 12:00+0000 | 5.533042696 |
| …                     | …                     | …           |
| 02.11.2016 12:30+0000 | 01.11.2016 12:00+0000 | 5.429153394 |
| 02.11.2016 12:30+0000 | 02.11.2016 00:00+0000 | 5.580232543 |
| 02.11.2016 12:30+0000 | 02.11.2016 12:00+0000 | 5.266140403 | <- i want this row
| 02.11.2016 13:00+0000 | 24.10.2016 00:00+0000 | 3.969746715 | <- here is no "future" runtime
| 02.11.2016 13:00+0000 | 24.10.2016 12:00+0000 | 5.704328337 |
| …                     | …                     | …           |
+-----------------------+-----------------------+-------------+

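My current working solution: What I am doing now is iterating over the large data frame and filtering out the rows that match my expectations. It works, but it is slow on my laptop (it took almost an hour to get through 500,000 rows), and I have a large number of CSV files to go through... I ask myself whether it is possible to do this faster. I could also use additional R packages if they work faster. Besides, I am thinking about uploading the data to a faster SQL Server; would date-comparison tasks like this be processed faster in SQL?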

#Some preliminary transformations for the comparable posixct format:    
LA_Date_EC$Forecast.Time<-as.POSIXlt(LA_Date_EC$Forecast.Time,format="%d.%m.%Y %H:%M+%S",tz="UTC")
LA_Date_EC$Runtime.Forecast<-as.POSIXlt(LA_Date_EC$Runtime.Forecast,format="%d.%m.%Y %H:%M+%S",tz="UTC")


test.df2<-data.frame()
names(test.df2)<-names(LA_Date_EC) ##info: Datetimes 
for (l in 2:469501){

  if(LA_Date_EC[l,1]!=LA_Date_EC[l+1,1]){
    #print(l)

    if(LA_Date_EC[l,2]>=LA_Date_EC[l,1]){
      test.df2<-rbind.data.frame(test.df2,LA_Date_EC[l-1,])  

    }else{
      test.df2<-rbind.data.frame(test.df2,LA_Date_EC[l,])  
    }

  }

}

Edit: an example excerpt as dput output, for reading into R:
structure(list(Forecast.Time = structure(list(sec = c(0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), min = c(30L, 30L, 30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 30L, 30L, 30L, 30L), hour = c(10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L), mday = c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), mon = c(10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L
), year = c(116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
116L, 116L), wday = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), 
    yday = c(306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 306L, 
    306L, 306L, 306L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("sec", "min", 
"hour", "mday", "mon", "year", "wday", "yday", "isdst"), class = c("POSIXlt", 
"POSIXt"), tzone = "UTC"), Runtime.Forecast = structure(list(
    sec = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L, 12L, 
    0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 12L, 
    0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 
    12L, 0L, 12L, 0L, 12L, 0L, 12L, 12L, 0L, 12L, 0L, 12L, 0L, 
    12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 
    0L, 12L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 
    0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 
    12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 
    0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 
    12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 0L, 12L, 
    0L, 12L), mday = c(27L, 27L, 28L, 28L, 29L, 29L, 30L, 30L, 
    31L, 31L, 1L, 1L, 2L, 2L, 23L, 24L, 24L, 25L, 25L, 26L, 26L, 
    27L, 27L, 28L, 28L, 29L, 29L, 30L, 30L, 31L, 31L, 1L, 1L, 
    2L, 2L, 23L, 24L, 24L, 25L, 25L, 26L, 26L, 27L, 27L, 28L, 
    28L, 29L, 29L, 30L, 30L, 31L, 31L, 1L, 1L, 2L, 2L, 23L, 24L, 
    24L, 25L, 25L, 26L, 26L, 27L, 27L, 28L, 28L, 29L, 29L, 30L, 
    30L, 31L, 31L, 1L, 1L, 2L, 2L, 24L, 24L, 25L, 25L, 26L, 26L, 
    27L, 27L, 28L, 28L, 29L, 29L, 30L, 30L, 31L, 31L, 1L, 1L, 
    2L, 2L, 24L, 24L, 25L, 25L, 26L, 26L, 27L, 27L, 28L, 28L, 
    29L, 29L, 30L, 30L, 31L, 31L, 1L, 1L, 2L, 2L, 24L, 24L, 25L, 
    25L), mon = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 
    10L, 10L, 10L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 
    10L, 10L, 10L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 
    10L, 10L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 9L, 9L, 9L, 9L), year = c(116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 
    116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L, 116L
    ), wday = c(4L, 4L, 5L, 5L, 6L, 6L, 0L, 0L, 1L, 1L, 2L, 2L, 
    3L, 3L, 0L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 
    0L, 0L, 1L, 1L, 2L, 2L, 3L, 3L, 0L, 1L, 1L, 2L, 2L, 3L, 3L, 
    4L, 4L, 5L, 5L, 6L, 6L, 0L, 0L, 1L, 1L, 2L, 2L, 3L, 3L, 0L, 
    1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 0L, 0L, 1L, 
    1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 
    6L, 6L, 0L, 0L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 
    3L, 4L, 4L, 5L, 5L, 6L, 6L, 0L, 0L, 1L, 1L, 2L, 2L, 3L, 3L, 
    1L, 1L, 2L, 2L), yday = c(300L, 300L, 301L, 301L, 302L, 302L, 
    303L, 303L, 304L, 304L, 305L, 305L, 306L, 306L, 296L, 297L, 
    297L, 298L, 298L, 299L, 299L, 300L, 300L, 301L, 301L, 302L, 
    302L, 303L, 303L, 304L, 304L, 305L, 305L, 306L, 306L, 296L, 
    297L, 297L, 298L, 298L, 299L, 299L, 300L, 300L, 301L, 301L, 
    302L, 302L, 303L, 303L, 304L, 304L, 305L, 305L, 306L, 306L, 
    296L, 297L, 297L, 298L, 298L, 299L, 299L, 300L, 300L, 301L, 
    301L, 302L, 302L, 303L, 303L, 304L, 304L, 305L, 305L, 306L, 
    306L, 297L, 297L, 298L, 298L, 299L, 299L, 300L, 300L, 301L, 
    301L, 302L, 302L, 303L, 303L, 304L, 304L, 305L, 305L, 306L, 
    306L, 297L, 297L, 298L, 298L, 299L, 299L, 300L, 300L, 301L, 
    301L, 302L, 302L, 303L, 303L, 304L, 304L, 305L, 305L, 306L, 
    306L, 297L, 297L, 298L, 298L), isdst = c(0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("sec", 
"min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"
), class = c("POSIXlt", "POSIXt"), tzone = "UTC"), Wind.Speed = c(12.0889469204481, 
8.1534483762018, 11.229031832199, 9.51623004872928, 7.99700924410322, 
8.06420698869646, 7.46190421726437, 7.95691440205356, 8.19089703425263, 
7.50023772800533, 7.46471349405832, 7.87503218264228, 8.10704533381368, 
8.25997087655172, 12.9641999142878, 5.95739166070848, 8.48709144265445, 
12.3686489749888, 3.27377438788927, 3.8132283355639, 5.40513611081943, 
12.3699466614361, 7.91484229489558, 11.0188269744693, 9.56301437212706, 
7.91921747636113, 7.86903553214633, 7.4208161449472, 7.7049451673898, 
8.02618971449148, 7.32764074016071, 7.26610021373866, 7.70408467708526, 
7.90262085370489, 8.1065215773556, 13.5223226992998, 6.23422083753905, 
8.45254447734072, 12.3148462884236, 3.00095427388565, 4.11192170847009, 
5.35120820775642, 12.6509464024241, 7.67623621358937, 10.8086221167397, 
9.60979869552483, 7.84142570861904, 7.67386407559621, 7.37972807263003, 
7.45297593272604, 7.86148239473033, 7.15504375231609, 7.06748693341901, 
7.53313717152826, 7.69819637359609, 7.95307227815948, 14.0804454843119, 
6.51105001436962, 8.41799751202699, 12.2610436018583, 2.72813415988203, 
4.41061508137628, 5.2972803046934, 12.9319461434121, 7.43763013228316, 
10.59841725901, 9.65658301892261, 7.76363394087695, 7.47869261904608, 
7.33864000031286, 7.20100669806228, 7.69677507496918, 6.98244676447147, 
6.86887365309936, 7.36218966597125, 7.4937718934873, 7.79962297896336, 
6.47610742140243, 8.44809907428991, 12.1754295175459, 2.80289574128868, 
4.43071887689015, 5.25901387681356, 12.8048270636345, 7.77257529660677, 
10.6689406707837, 9.67371178278272, 7.71232463800448, 7.639287068313, 
7.37678432847625, 7.39386920787284, 7.65056355621861, 7.0961073828294, 
6.94340806177623, 7.41655132109855, 7.53010844435008, 7.89628470472931, 
6.44116482843524, 8.47820063655284, 12.0898154332335, 2.87765732269532, 
4.450822672404, 5.22074744893371, 12.6777079838569, 8.10752046093037, 
10.7394640825575, 9.69084054664283, 7.66101533513201, 7.79988151757991, 
7.41492865663964, 7.58673171768341, 7.60435203746804, 7.20976800118733, 
7.01794247045311, 7.47091297622585, 7.56644499521287, 7.99294643049527, 
6.40622223546805, 8.50830219881576, 12.0042013489211, 2.95241890410197
)), .Names = c("Forecast.Time", "Runtime.Forecast", "Wind.Speed"
), row.names = 1400:1520, class = "data.frame")

2 answers:

Answer 0 (score: 2)

You can do this with dplyr or data.table. data.table should be the fastest solution for you.

dplyr

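library(dplyr)

# df is the data frame from the question; convert the date-time columns to POSIXct first
df$Forecast.Time <- as.POSIXct(df$Forecast.Time)
df$Runtime.Forecast <- as.POSIXct(df$Runtime.Forecast)

# keep only runs computed before the forecast period, then the last row per period
filtered <- df %>% filter(Forecast.Time > Runtime.Forecast) %>%
     group_by(Forecast.Time) %>%
     summarise_all(last)

data.table

library(data.table)

# uses df with the POSIXct columns from above
df_dt <- as.data.table(df)

filtered_dt <- df_dt[Forecast.Time > Runtime.Forecast, lapply(.SD, last), by = Forecast.Time]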
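
As a quick check of the filter logic, here is a minimal base R sketch run on the rows visible in the question's excerpt (the elided rows are left out). The column names follow the question's code, with Value taken from the excerpt table, and the "%z" format for the "+0000" offset is my assumption about how the raw strings could be parsed. It also assumes the rows are already sorted by runtime within each forecast period, as they are in the excerpt.

```r
# Toy data: only the rows visible in the question's excerpt (assumed layout).
raw <- data.frame(
  Forecast.Time = c(rep("02.11.2016 11:30+0000", 6),
                    rep("02.11.2016 12:00+0000", 3),
                    rep("02.11.2016 12:30+0000", 3)),
  Runtime.Forecast = c("31.10.2016 00:00+0000", "31.10.2016 12:00+0000",
                       "01.11.2016 00:00+0000", "01.11.2016 12:00+0000",
                       "02.11.2016 00:00+0000", "02.11.2016 12:00+0000",
                       "01.11.2016 12:00+0000", "02.11.2016 00:00+0000",
                       "02.11.2016 12:00+0000",
                       "01.11.2016 12:00+0000", "02.11.2016 00:00+0000",
                       "02.11.2016 12:00+0000"),
  Value = c(5.544368776, 4.71684533, 5.374274986, 5.892114875,
            6.18387462, 5.852306909,
            5.541325144, 5.745831136, 5.347949883,
            5.429153394, 5.580232543, 5.266140403),
  stringsAsFactors = FALSE
)

# Parse once into POSIXct; %z reads the "+0000" UTC offset (assumed format).
fmt <- "%d.%m.%Y %H:%M%z"
raw$Forecast.Time    <- as.POSIXct(raw$Forecast.Time,    format = fmt, tz = "UTC")
raw$Runtime.Forecast <- as.POSIXct(raw$Runtime.Forecast, format = fmt, tz = "UTC")

# Vectorised filter: keep runs computed strictly before the forecast period.
past <- raw[raw$Forecast.Time > raw$Runtime.Forecast, ]

# Keep the last remaining row of each forecast period.
result <- past[!duplicated(past$Forecast.Time, fromLast = TRUE), ]
print(result)
```

The three rows that remain are the ones marked "i want this row" in the excerpt. The whole-column comparison and the single subsetting step replace the row-by-row loop and the repeated rbind calls, which is where most of the time is likely going.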

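Answer 1 (score: 1)

Here is a potential solution using the dplyr package. Using the lead/lag functions and group_by eliminates the loop.

As I mentioned in the comment above, I converted the date/times to POSIXct objects.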

library(dplyr)
# df is a copy of the original data, with the date-times converted to POSIXct
df <- LA_Date_EC
df$Forecast.Time <- as.POSIXct(df$Forecast.Time)
df$Runtime.Forecast <- as.POSIXct(df$Runtime.Forecast)
# find all future values and remove them from the data
future <- df$Runtime.Forecast >= lag(df$Forecast.Time)
future[1] <- FALSE
df <- df[!future, ]

# group by the forecast time and then keep the last row of each group
answer <- df %>% group_by(Forecast.Time) %>%
  summarize(Runtime.Forecast = last(Runtime.Forecast), Wind.Speed = last(Wind.Speed))
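
On the POSIXct conversion mentioned above, the following base R sketch shows the difference between the two classes. The timestamps are made-up illustration values, not rows from the question. A POSIXlt value is a list of calendar fields, while a POSIXct value is a single number of seconds since 1970-01-01 UTC, so a whole column can be compared in one vectorised operation.

```r
# Illustration values only (not taken from the question's data).
x_lt <- as.POSIXlt("2016-11-02 11:30:00", tz = "UTC")
x_ct <- as.POSIXct(x_lt)

# POSIXlt stores separate fields (sec, min, hour, ...); POSIXct stores one number.
lt_is_list    <- is.list(unclass(x_lt))
ct_is_numeric <- is.numeric(unclass(x_ct))

# Comparing two POSIXct vectors works element-wise, with no loop.
forecast <- as.POSIXct(c("2016-11-02 11:30:00", "2016-11-02 12:00:00"), tz = "UTC")
runtime  <- as.POSIXct(c("2016-11-02 12:00:00", "2016-11-02 00:00:00"), tz = "UTC")
is_future <- runtime >= forecast
print(is_future)
```

Here the first runtime lies after its forecast period and the second does not, so is_future flags only the first element.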