Alternative to a for loop for data frames in R

Time: 2016-05-24 16:23:57

Tags: r

Data Description

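Vehicle.ID2 is the unique identifier of a vehicle (car, truck or motorcycle). The movement of each vehicle is recorded in the Time column of the dt2 data frame. The original dt2 contains more than 40 variables at every Time frame, including speed, acceleration, etc. From speed and Time, I estimated slopez (smooth acceleration) for each Vehicle.ID2 using the segmented package in R. These slopez remain constant between certain points in Time, called break points, bp. To clarify, I am including 3 data sets (dt2, dt_slopez and dt_bp, given as dput output below) for 2 unique Vehicle.ID2 values: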

Goal

dt2 <- structure(list(Vehicle.ID2 = c("6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", 
"6-4", "6-4", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", 
"8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5"), Time = c(14.1, 14.2, 14.3, 14.4, 14.5, 14.6, 14.7, 14.8, 14.9, 15, 15.1, 15.2, 15.3, 15.4, 16.8, 16.9, 17, 17.1, 17.2, 17.3, 17.4, 17.5, 17.6, 17.7, 17.8, 17.9, 18, 18.1, 18.2, 22.5, 22.6, 22.7, 22.9, 23, 23.1, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.9, 24, 24.1, 24.2, 24.3, 24.4, 24.5, 24.6, 24.7, 24.8, 24.9, 25, 25.1, 25.2, 25.3, 25.4, 25.5, 25.6, 25.7, 25.8, 25.9, 26, 26.1, 26.2, 26.3, 26.4, 26.5, 26.6, 26.7, 26.8, 26.9, 27, 27.1, 27.2, 27.3, 27.4, 27.5, 27.6, 27.7, 27.8, 27.9, 28, 28.1, 28.2, 28.3, 28.4, 28.5, 28.6, 28.7, 28.8, 28.9, 29, 29.1, 29.2, 29.3, 29.4, 29.5, 29.6, 29.7, 29.8, 29.9, 30, 30.1, 30.2, 30.3, 30.4, 30.5, 30.6, 30.7, 30.8, 30.9, 31, 31.1, 31.2, 31.3, 31.4, 31.5, 31.6, 31.7, 31.8, 31.9, 32, 32.1, 32.2, 32.3, 32.4, 32.5, 32.6, 32.7, 32.8, 32.9, 33, 33.1, 33.2, 33.3, 33.4, 33.5, 33.6, 33.7, 33.8, 33.9, 34, 34.1, 34.2, 34.3, 34.4, 34.5, 34.6, 34.7, 34.8, 34.9, 35, 35.1, 35.2, 35.3, 35.4, 35.5, 35.6, 35.7, 35.8, 35.9, 36, 
36.1, 36.2, 36.3, 36.4, 36.5, 36.6, 36.7, 36.8, 36.9, 37, 37.1, 37.2, 37.3, 37.4, 37.5, 37.6, 37.7, 37.8, 37.9, 38, 38.1, 38.2, 38.3, 38.4, 38.5, 38.6, 38.7, 38.8, 38.9, 39, 39.1, 39.2, 39.3, 39.4, 39.5, 39.6, 39.7, 39.8, 39.9, 40, 40.1, 40.2, 40.3, 40.4, 40.5, 40.6, 40.7, 40.8, 40.9, 41, 41.1, 41.2, 41.3, 41.4, 41.5, 41.6, 41.7, 41.8, 41.9, 42, 42.1, 42.2, 42.3, 42.4, 42.5, 42.6, 42.7, 42.8, 42.9, 43, 43.1, 43.2, 43.3, 43.4, 43.5, 43.6, 43.7, 43.8, 43.9, 44, 44.1, 44.2, 44.3, 44.4, 44.5, 44.6, 44.7, 44.8, 44.9, 45, 45.1, 45.2, 45.3, 45.4, 45.5, 45.6, 45.7, 45.8, 45.9, 46, 46.1, 46.2, 46.3, 46.4, 46.5, 46.6, 46.7, 46.8, 46.9, 47, 47.1, 47.2, 47.3, 47.4, 47.5, 47.6, 47.7, 47.8, 2.9, 3, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 11, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7, 11.8, 11.9, 12, 12.1, 12.2, 12.3, 12.4, 12.5, 12.6, 12.7, 12.8, 12.9, 13, 13.1, 13.2, 13.3, 13.4, 13.5, 13.6, 13.7, 13.8, 13.9, 14, 14.1, 14.2, 14.3, 14.4, 14.5, 14.6, 14.7, 14.8, 14.9, 15, 15.1, 15.2, 15.3, 15.4, 15.5, 15.6, 15.7, 15.8, 15.9, 16, 16.1, 16.2, 16.3, 16.4, 16.5, 16.6, 16.7, 16.8, 16.9, 17, 17.1, 17.2, 17.3, 17.4, 17.5, 17.6, 17.7, 17.8, 17.9, 18, 18.1, 18.2, 18.3, 18.4, 18.5, 18.6, 18.7, 18.8, 18.9, 19, 19.1, 19.2, 19.3, 19.4, 19.5, 19.6, 19.7, 19.8, 19.9, 20, 20.1, 20.2, 20.3, 20.4, 20.5, 20.6, 20.7, 20.8, 20.9, 21, 21.1, 21.2, 21.3, 21.4, 21.5, 21.6, 21.7, 21.8, 21.9, 22, 22.1, 22.2, 22.3, 22.4, 22.5, 22.6, 22.7, 22.8, 22.9, 23, 23.1, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.9, 24, 24.1, 24.2, 24.3, 24.4, 24.5, 24.6, 24.7, 24.8, 24.9, 25, 25.1, 25.2, 25.3, 25.4, 25.5, 25.6, 25.7, 25.8, 25.9, 26, 26.1, 26.2, 26.3, 26.4, 26.5, 26.6, 
26.7, 26.8, 26.9, 27, 27.1, 27.2, 27.3, 27.4, 27.5, 27.6, 27.7, 27.8, 27.9, 28, 28.1, 28.2, 28.3, 28.4, 28.5, 28.6, 28.7, 28.8, 28.9, 29, 29.1, 29.2, 29.3, 29.4, 29.5, 29.6, 29.7, 29.8, 29.9, 30, 30.1, 30.2, 30.3, 30.4, 30.5, 30.6, 30.7, 30.8, 30.9, 31, 31.1, 31.2, 31.3, 31.4, 31.5, 31.6, 31.7, 31.8, 31.9, 32, 32.1, 32.2, 32.3, 32.4, 32.5, 32.6, 32.7, 32.8, 32.9, 33, 33.1, 33.2, 33.3, 33.4, 33.5, 33.6, 33.7, 33.8, 33.9, 34, 34.1, 34.2, 34.3, 34.4, 34.5, 34.6, 34.7, 34.8, 34.9, 35, 35.1, 35.2, 35.3, 35.4, 35.5, 35.6, 35.7, 35.8, 35.9, 36, 36.1, 36.2, 36.3, 36.4, 36.5, 36.6, 36.7, 36.8, 36.9, 37, 37.1, 37.2, 37.3, 37.4, 37.5, 37.6, 37.7, 37.8, 37.9, 38, 38.1, 38.2, 38.3, 38.4, 38.5, 38.6, 38.7, 38.8, 38.9, 39, 39.1, 39.2, 39.3, 39.4, 39.5, 39.6, 39.7, 39.8, 39.9, 40, 40.1, 40.2, 40.3, 40.4, 40.5, 40.6, 40.7, 40.8, 40.9, 41, 41.1, 41.2, 41.3, 41.4, 41.5, 41.6, 41.7, 41.8, 41.9, 42, 42.1, 42.2, 42.3, 42.4, 42.5, 42.6, 42.7, 42.8, 42.9, 43, 43.1, 43.2, 43.3, 43.4, 43.5, 43.6, 43.7, 43.8, 43.9, 44, 44.1, 44.2, 44.3, 44.4, 44.5, 44.6, 44.7, 44.8, 44.9, 45, 45.1, 45.2, 45.3, 45.4, 45.5, 45.6, 45.7, 45.8, 45.9)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -713L), .Names = c("Vehicle.ID2", "Time")) dt_slopez <- structure(list(Vehicle.ID2 = c("6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5"), slopez = c(-1.707, -1.424, 1.414, -0.1184, 0.199, 0.6955, 0.6875, 0.1744, 0.6189, 0.06681, 1.606, 2.348, 0.632, 0.295, -2.298, 1.918, -0.7171, -1.506, 1.057, 0.1312, 1.589, 0.3844, 2.314, -0.6393, 2.054, 0.8238, 0.225, 0.7208, 0.489, 0.0303)), .Names = c("Vehicle.ID2", "slopez"), class = c("tbl_dt", "tbl", "data.table", "data.frame" ), row.names = c(NA, -30L)) dt_bp <- structure(list(Vehicle.ID2 = c("6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "6-4", "8-5", "8-5", "8-5", 
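"8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5", "8-5"), bp = c(15, 17.3, 18.2, 24.3, 28.4, 33, 38.1, 39.4, 41.3, 41.8, 43.4, 44.2, 45.7, 7.1, 9.4, 11.4, 13.1, 15.3, 24.3, 26.5, 30.4, 34.1, 36.8, 39.9, 40.4, 40.9, 41.8, 44.3)), .Names = c("Vehicle.ID2", "bp"), class = c("tbl_dt", "tbl", "data.table", "data.frame"), row.names = c(NA, -28L))

bp are the break points, and I want to find these points in the Time column of the dt2 data frame. Then I want to create a new column, slope, in dt2 that takes the value of slopez between specific bp. For example, for Vehicle.ID2=="6-4", the first bp is 15 and the first entry in Time is 14.1. This means that between 14.1 and 15, slope should be the first value in slopez (for this Vehicle.ID2), i.e. -1.707; between 15 and 17.3 it should be -1.424, and so on.

I asked this question previously and got an answer here.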
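The mapping described in the goal can be pictured as a table of segments, one row per slope. The following is only an illustrative sketch for vehicle "6-4", using the bp and slopez values from the data above; the `segments` table, the `slope_at` helper, and the choice of -Inf and Inf as outer limits are assumptions made for the illustration, not part of the original post.

```r
# Break points and slopes for vehicle "6-4", copied from dt_bp and dt_slopez
bp <- c(15, 17.3, 18.2, 24.3, 28.4, 33, 38.1, 39.4, 41.3, 41.8, 43.4, 44.2, 45.7)
slopez <- c(-1.707, -1.424, 1.414, -0.1184, 0.199, 0.6955, 0.6875, 0.1744,
            0.6189, 0.06681, 1.606, 2.348, 0.632, 0.295)

# 13 break points delimit 14 segments; the outer limits are assumed open-ended
segments <- data.frame(from = c(-Inf, bp), to = c(bp, Inf), slope = slopez)

# Hypothetical helper: slope of the segment [from, to) that contains time t
slope_at <- function(t) segments$slope[t >= segments$from & t < segments$to]

print(head(segments, 3))
print(slope_at(14.1))  # first segment
print(slope_at(15))    # a time equal to a break point belongs to the next segment
```

Treating each segment as closed on the left matches the `z < bp` comparison used in the question's loop.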

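for loop to apply the function

There are multiple Vehicle.ID2 values in the full data set. The following code works:

datalist = list()
for (i in unique(dt2$Vehicle.ID2)){
  veh <- subset(dt2, Vehicle.ID2==i)
  ix <- sapply(veh$Time, function(z) which.max(z < c(dt_bp[dt_bp$Vehicle.ID2==i,]$bp, Inf)))
  veh$slope <- dt_slopez[dt_slopez$Vehicle.ID2==i,]$slopez[ix]
  datalist[[i]] <- veh
}
dt2 <- do.call(rbind, datalist)

Problem

The above code does achieve the goal for these sample data and for the original data (about 150 MB in size), but it is slow on the original data. Moreover, the size of the final data set dt2 (with multiple Vehicle.ID2 values) is unacceptable: 59.4 GB! For these sample data this is not a problem, but it is too much for the original data set. What alternatives to the for loop could be used to achieve the same goal, so that it is fast and the final data set is small? Also, am I using the for loop incorrectly, and is that what makes the final data so big?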
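Before changing the approach, it may help to confirm where the size comes from. The sketch below uses small toy data frames (`before` and `after` are made-up names and values, not the real data) to show two basic checks: the row count should be unchanged by adding one column, and `object.size()` reports how much memory an object occupies.

```r
# Toy data standing in for dt2 before and after the slope column is added
before <- data.frame(Vehicle.ID2 = rep(c("6-4", "8-5"), each = 3),
                     Time = c(14.1, 14.2, 14.3, 2.9, 3, 3.1),
                     stringsAsFactors = FALSE)
after <- before
after$slope <- c(-1.707, -1.707, -1.707, -2.298, -2.298, -2.298)

# Adding one numeric column should not change the number of rows
same_rows <- nrow(after) == nrow(before)

# object.size() gives the memory used by each object
size_before <- object.size(before)
size_after <- object.size(after)

print(same_rows)
print(format(size_before, units = "Kb"))
print(format(size_after, units = "Kb"))
```

If the real result has many more rows than the input, or the reported size is far larger than one extra numeric column would explain, that points to where the growth happens.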

1 Answer:

Answer 0 (score: 0)

Here is an attempt using base R's cut command.

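dt2$slope <- NA_real_  # add placeholder column to the data frame

for (i in unique(dt2$Vehicle.ID2)) {
  rows <- dt2$Vehicle.ID2 == i
  # add lower and upper limits around this vehicle's break points
  breaks <- c(-Inf, dt_bp$bp[dt_bp$Vehicle.ID2 == i], Inf)
  # slopes for this vehicle only, one per interval
  slopes <- dt_slopez$slopez[dt_slopez$Vehicle.ID2 == i]
  # cut, look up and assign; right = FALSE puts Time == bp in the next interval
  dt2$slope[rows] <- slopes[as.integer(cut(dt2$Time[rows], breaks = breaks, right = FALSE))]
}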
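One detail worth checking with cut is how a Time that is exactly equal to a break point is classified. By default cut closes intervals on the right, whereas the loop in the question uses `z < bp`, which sends such a Time to the next segment. A small sketch with the first two break points of vehicle "6-4" (the variable names are chosen only for this illustration):

```r
breaks <- c(-Inf, 15, 17.3, Inf)
time <- c(14.1, 15, 17.3)

# Default (right = TRUE): intervals are (a, b], so 15 stays in the first one
idx_default <- as.integer(cut(time, breaks = breaks))

# right = FALSE: intervals are [a, b), so 15 moves to the second one
idx_left_closed <- as.integer(cut(time, breaks = breaks, right = FALSE))

# The rule used in the question's loop, for comparison
idx_question <- sapply(time, function(z) which.max(z < c(15, 17.3, Inf)))

print(idx_default)
print(idx_left_closed)
print(idx_question)
```

With `right = FALSE` the indices agree with the question's rule at the break points.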

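Yes, a for loop is used, but the time spent looping should be insignificant compared with the time needed to perform the subsetting and assignment. I am sure there is a better way; possibly the dplyr package can perform a join?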
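As one possible variation in base R (not the dplyr join mentioned above), the per-row work can be handed to `findInterval()`, with the data split by vehicle once up front. This is a minimal sketch under stated assumptions: `add_slope` is a hypothetical helper, the toy tables only mimic the column names of dt2, dt_bp and dt_slopez, and break points are assumed to be sorted in increasing order within each vehicle.

```r
# Toy stand-ins for dt2, dt_bp and dt_slopez (values are illustrative only)
dt2 <- data.frame(Vehicle.ID2 = c("a", "a", "a", "b", "b"),
                  Time = c(1, 2, 3, 1, 5),
                  stringsAsFactors = FALSE)
dt_bp <- data.frame(Vehicle.ID2 = c("a", "b"), bp = c(2, 4),
                    stringsAsFactors = FALSE)
dt_slopez <- data.frame(Vehicle.ID2 = c("a", "a", "b", "b"),
                        slopez = c(0.5, -0.5, 1.5, -1.5),
                        stringsAsFactors = FALSE)

# Hypothetical helper, not part of any package
add_slope <- function(dt2, dt_bp, dt_slopez) {
  # Split each table by vehicle once instead of subsetting inside the loop
  bp_by_veh <- split(dt_bp$bp, dt_bp$Vehicle.ID2)
  slope_by_veh <- split(dt_slopez$slopez, dt_slopez$Vehicle.ID2)
  rows_by_veh <- split(seq_len(nrow(dt2)), dt2$Vehicle.ID2)

  slope <- rep(NA_real_, nrow(dt2))  # preallocated result vector
  for (id in names(rows_by_veh)) {
    rows <- rows_by_veh[[id]]
    # Number of break points <= Time, plus 1, is the segment index
    seg <- findInterval(dt2$Time[rows], bp_by_veh[[id]]) + 1L
    slope[rows] <- slope_by_veh[[id]][seg]
  }
  dt2$slope <- slope
  dt2
}

res <- add_slope(dt2, dt_bp, dt_slopez)
print(res)
```

The loop still runs once per vehicle, but each iteration is a single vectorized lookup, and the result is written into one preallocated vector rather than assembled with rbind.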