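I have to subset this "plt" list. "plt" is a list of GPS points with date and time. "labels" is a list of all the trips in a day, with their start and end times.
I want to take the value in row 1 of labels$Start and the value in row 1 of labels$End, search for those values in the plt$Data_Time column, and keep the rows of plt that fall between the start and end values.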
> str(labels)
'data.frame': 10 obs. of 8 variables:
$ Date_ST: Factor w/ 5 levels "2008/04/28","2008/04/29",..: 1 1 2 2 3 3 4 4 5 5
$ Time_ST: Factor w/ 15 levels "01:27:05","01:33:29",..: 13 15 4 10 1 7 8 12 2 11
$ Date_ET: Factor w/ 5 levels "2008/04/28","2008/04/29",..: 1 1 2 2 3 3 4 4 5 5
$ Time_ET: Factor w/ 15 levels "01:35:25","01:41:11",..: 13 15 3 10 1 5 6 12 2 9
$ Mode : Factor w/ 2 levels "subway","walk": 2 2 2 2 2 2 2 2 2 2
$ ID : int 1 3 4 6 7 9 10 12 13 15
$ Start : chr "2008/04/28 11:27:42" "2008/04/28 11:42:56" "2008/04/29 01:38:21" "2008/04/29 01:57:55" ...
$ End : chr "2008/04/28 11:27:58" "2008/04/28 11:50:10" "2008/04/29 01:41:28" "2008/04/29 02:03:28" ...
> str(plt)
'data.frame': 4377 obs. of 9 variables:
$ Lat : num 40.1 40.1 40.1 40.1 40.1 ...
$ Long : num 116 116 116 116 116 ...
$ X0 : int 0 0 0 0 0 0 0 0 0 0 ...
$ Alt : int 492 492 491 491 491 490 490 490 489 489 ...
$ n.days : num 39589 39589 39589 39589 39589 ...
$ Date : Factor w/ 5 levels "2008-05-21","2008-04-28",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Time : Factor w/ 2955 levels "01:33:29","01:33:30",..: 1 2 3 4 5 6 7 8 9 10 ...
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ Data_Time: chr "2008-05-21 01:33:29" "2008-05-21 01:33:30" "2008-05-21 01:33:31" "2008-05-21 01:33:33" ...
head(plt)
Lat Long X0 Alt n.days Date Time ID Data_Time
1 40.07045 116.3130 0 492 39589.06 2008-05-21 01:33:29 1 2008-05-21 01:33:29
2 40.07045 116.3133 0 492 39589.06 2008-05-21 01:33:30 2 2008-05-21 01:33:30
3 40.07050 116.3131 0 491 39589.06 2008-05-21 01:33:31 3 2008-05-21 01:33:31
4 40.07052 116.3130 0 491 39589.06 2008-05-21 01:33:33 4 2008-05-21 01:33:33
5 40.07050 116.3129 0 491 39589.06 2008-05-21 01:33:35 5 2008-05-21 01:33:35
6 40.07047 116.3129 0 490 39589.07 2008-05-21 01:33:37 6 2008-05-21 01:33:37
labels
Date_ST Time_ST Date_ET Time_ET Mode ID Start End
1 2008/04/28 11:27:42 2008/04/28 11:27:58 walk 1 2008/04/28 11:27:42 2008/04/28 11:27:58
3 2008/04/28 11:42:56 2008/04/28 11:50:10 walk 3 2008/04/28 11:42:56 2008/04/28 11:50:10
4 2008/04/29 01:38:21 2008/04/29 01:41:28 walk 4 2008/04/29 01:38:21 2008/04/29 01:41:28
6 2008/04/29 01:57:55 2008/04/29 02:03:28 walk 6 2008/04/29 01:57:55 2008/04/29 02:03:28
7 2008/05/12 01:27:05 2008/05/12 01:35:25 walk 7 2008/05/12 01:27:05 2008/05/12 01:35:25
9 2008/05/12 01:51:11 2008/05/12 01:55:35 walk 9 2008/05/12 01:51:11 2008/05/12 01:55:35
for(i in 1:nrow(labels)) {
  a = labels$Start[i] # take the start/end time of the trip
  b = labels$End[i]
  k = plt[plt$Data_Time >= a & plt$Data_Time < b, ]
  LatLong = k[1:2]
  head(LatLong)
  write.table(LatLong, "~/Desktop/LatLongTrip.txt", sep="\t")
}
Unfortunately, the result is:
> k = plt[plt$Data_Time >= b & plt$Data_Time < a, ]
> k
[1] Lat Long X0 Alt n.days Date Time ID Data_Time
<0 rows> (or 0-length row.names)
Answer 0 (score: 1)
You don't need a for loop :) Here is how:
First, make sure the sqldf library is installed and loaded.
Then set up a mock data example:
fechasInicioYFin <- data.frame(
fechasInicio = as.POSIXct(c('2016-08-19 10:00','2016-08-25 15:00','2016-09-15 15:00','2016-07-20 11:00')),
fechasFin = as.POSIXct(c('2016-08-19 14:00','2016-08-25 18:00','2016-09-15 19:00','2016-07-20 16:00'))
)
dataConFecha <- data.frame(num1 = c(1,2,3,4,5,6), num2 = c(11:16),
fechas = as.POSIXct(c('2016-08-19 12:00','2016-08-25 16:00','2016-09-15 16:00','2016-07-20 13:00',
'2016-08-19 13:00','2016-09-15 17:00'))
)
Now simply join the two tables on the date column, and select only the columns you are interested in.
sqldf("select a.*,b.fechasInicio,b.fechasFin from dataConFecha as a join fechasInicioYFin as b on
a.fechas between b.fechasInicio and b.fechasFin")
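For readers without sqldf, the same range join can be approximated in base R. This is only a sketch, not part of the original answer: it rebuilds the mock data above with an explicit time zone (`tz = "UTC"` is an assumption), and the names `pares` and `resultado` are hypothetical. It pairs every point with every interval and then keeps the pairs where the point lies inside the interval, inclusive on both ends like SQL `between`.

```r
# Mock data from the answer; tz = "UTC" is an assumption added for reproducibility.
fechasInicioYFin <- data.frame(
  fechasInicio = as.POSIXct(c('2016-08-19 10:00', '2016-08-25 15:00',
                              '2016-09-15 15:00', '2016-07-20 11:00'), tz = "UTC"),
  fechasFin    = as.POSIXct(c('2016-08-19 14:00', '2016-08-25 18:00',
                              '2016-09-15 19:00', '2016-07-20 16:00'), tz = "UTC")
)
dataConFecha <- data.frame(
  num1 = 1:6, num2 = 11:16,
  fechas = as.POSIXct(c('2016-08-19 12:00', '2016-08-25 16:00', '2016-09-15 16:00',
                        '2016-07-20 13:00', '2016-08-19 13:00', '2016-09-15 17:00'),
                      tz = "UTC")
)

# Cross join: by = NULL makes merge() return the Cartesian product.
pares <- merge(dataConFecha, fechasInicioYFin, by = NULL)

# Keep only the pairs where the point falls inside the interval (inclusive).
dentro <- pares$fechas >= pares$fechasInicio & pares$fechas <= pares$fechasFin
resultado <- pares[dentro, ]
```

The cross join grows as rows(points) × rows(intervals), so this is reasonable for a few thousand points and a handful of trips, but a proper range join is preferable for large tables.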
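**Edit: use the SQL "between" operator instead of >= and <=, as suggested by @G. Grothendieck.
As you can see, the data is now essentially grouped by start date and end date.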
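To connect this back to the question's data: a likely reason the original loop returns zero rows is that labels$Start and plt$Data_Time are both character columns with different separators ("2008/04/28" versus "2008-04-28"), so `>=` and `<` compare strings rather than times. The sketch below parses each column with its own format before subsetting. The data values are made up to mimic the question's structure, and `StartT`, `EndT`, `When`, `trips` and `tz = "UTC"` are assumptions introduced for illustration.

```r
# Illustrative data shaped like the question's labels and plt (values invented).
labels <- data.frame(
  ID    = c(1, 3),
  Start = c("2008/04/28 11:27:42", "2008/04/28 11:42:56"),
  End   = c("2008/04/28 11:27:58", "2008/04/28 11:50:10"),
  stringsAsFactors = FALSE
)
plt <- data.frame(
  Lat  = c(40.0701, 40.0702, 40.0703, 40.0704),
  Long = c(116.3131, 116.3132, 116.3133, 116.3134),
  Data_Time = c("2008-04-28 11:27:45", "2008-04-28 11:27:50",
                "2008-04-28 11:45:00", "2008-04-28 12:00:00"),
  stringsAsFactors = FALSE
)

# Parse each column with the format it actually uses ("/" vs "-").
labels$StartT <- as.POSIXct(labels$Start, format = "%Y/%m/%d %H:%M:%S", tz = "UTC")
labels$EndT   <- as.POSIXct(labels$End,   format = "%Y/%m/%d %H:%M:%S", tz = "UTC")
plt$When      <- as.POSIXct(plt$Data_Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# One Lat/Long data frame per trip, using the question's >= start and < end rule.
trips <- lapply(seq_len(nrow(labels)), function(i) {
  inside <- plt$When >= labels$StartT[i] & plt$When < labels$EndT[i]
  plt[inside, c("Lat", "Long")]
})
names(trips) <- labels$ID
```

Each element of `trips` can then be written to its own file, for example with a file name built from the trip ID, instead of overwriting a single file on every iteration.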