Iterative cumulative sum

Asked: 2015-07-24 18:25:12

Tags: r data.table

The following illustrates my problem with an easily reproducible data set:

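If I want to know exactly where I need to stop on a long road trip, I can calculate the cumulative gallons of fuel used at each mile. I assume my vehicle's fuel economy is a function of my speed and is best at 55 mph, and I keep 1 gallon in reserve for emergency searches for a gas station.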

For example:

require(data.table)
distance = c(1:3000)
speed = runif(length(distance),25,80)

avg_mpg = max(mtcars$mpg)
slow_mpg = ((55-speed)^(1/2))
fast_mpg = ((speed-55)^(1/2))
mpg = ifelse(speed<55, avg_mpg-slow_mpg, avg_mpg-fast_mpg)

gpm = 1 / mpg
tank_min = 1
full_tank = 13
fuel_use_function = function(x,y) x - cumsum(y)
fuel_use = fuel_use_function(full_tank, gpm)

trip_info = data.table(Mile_marker = distance,
                       Average_speed = speed,
                       Miles_per_gallon = mpg, 
                       Gallons_per_mile = gpm,
                       Fuel_use = fuel_use)
> trip_info
      Mile_marker Average_speed Miles_per_gallon Gallons_per_mile  Fuel_use
   1:           1      73.58330         29.58916       0.03379616  12.96620
   2:           2      55.09657         33.58925       0.02977143  12.93643
   3:           3      40.63387         30.10973       0.03321185  12.90322
   4:           4      69.86081         30.04503       0.03328338  12.86994
   5:           5      62.52757         31.15636       0.03209618  12.83784
  ---                                                                      
2996:        2996      62.83864         31.10024       0.03215409 -85.66442
2997:        2997      31.03928         29.00503       0.03447678 -85.69889
2998:        2998      30.81609         28.98229       0.03450383 -85.73340
2999:        2999      31.53397         29.05583       0.03441651 -85.76781
3000:        3000      34.27376         29.34739       0.03407458 -85.80189

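I would like to add another column to trip_info that records which "leg" of the trip I am on (that is, which tank of gas I am currently using: first, second, third, and so on). I can do this manually like so: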

first_tank_fuel_use = fuel_use_function(full_tank, trip_info$Gallons_per_mile)
first_tank_range = which(first_tank_fuel_use<tank_min)[1]
first_tank = rep(1,first_tank_range)
next_tanks = rep(2,nrow(trip_info) - length(first_tank))
tanks = c(first_tank, next_tanks)
trip_info$Tank_number = tanks

second_tank_fuel_use = 
  fuel_use_function(full_tank, trip_info$Gallons_per_mile[trip_info$Tank_number==2])
second_tank_range = which(second_tank_fuel_use<tank_min)[1]
second_tank = rep(2,second_tank_range)
tanks = c(first_tank, second_tank)
next_tanks = rep(3,nrow(trip_info) - length(tanks))
tanks = c(tanks, next_tanks)
trip_info$Tank_number = tanks

third_tank_fuel_use = ...

How can I automate the generation of this "tank number" index in a vectorized way?

1 Answer:

Answer 0 (score: 1)

I think this could be as simple as computing tank = cumsum(Gallons_per_mile) %/% (full_tank - tank_min) to indicate which tank you are on, and then filtering for the rows where diff(tank) is nonzero.
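A minimal sketch of that idea follows. It rebuilds a trip table like the one in the question so that the block runs on its own; the fixed seed, the compact `abs()` form of the mpg formula, the `usable` and `refuel_rows` names, and the `+ 1` that makes the first tank number 1 instead of 0 are my assumptions, not part of the original answer.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is reproducible

full_tank <- 13
tank_min  <- 1
speed     <- runif(3000, 25, 80)

# Same fuel-economy model as the question, written without ifelse():
# mpg drops by sqrt(|speed - 55|) on either side of 55 mph.
mpg <- max(mtcars$mpg) - sqrt(abs(speed - 55))

trip_info <- data.table(Mile_marker      = seq_along(speed),
                        Average_speed    = speed,
                        Gallons_per_mile = 1 / mpg)

usable <- full_tank - tank_min  # gallons burned per tank before refuelling

# Integer-divide cumulative consumption by the usable capacity.
# The + 1 shifts the zero-based quotient so the first tank is tank 1.
trip_info[, Tank_number := cumsum(Gallons_per_mile) %/% usable + 1]

# diff() is nonzero where the label changes; + 1 points at the first
# mile driven on each new tank.
refuel_rows <- which(diff(trip_info$Tank_number) != 0) + 1
trip_info[refuel_rows]
```

The whole labelling is a single `cumsum()` and one integer division, so no loop over tanks is needed, which is what the question means by a vectorized generator.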

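This approach introduces some rounding error that makes the estimated stopping points overly conservative, but the wiggle room provided by tank_min should be enough for it to work in practice.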
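One way to gauge that drift is to compare the shortcut with a plain loop that mirrors the manual procedure from the question: the mile on which the remaining fuel first drops below `tank_min` still counts toward the current tank, and the next mile starts on a full one. The sketch below is only an illustration under those assumptions; `exact_tank_number` is a hypothetical helper name, and the consumption vector is simulated with a fixed seed rather than taken from the original data.

```r
# Hypothetical helper: label each mile with its tank number by simulating
# refills one mile at a time (exact with respect to the manual procedure,
# but not vectorized).
exact_tank_number <- function(gpm, full_tank = 13, tank_min = 1) {
  tank    <- integer(length(gpm))
  fuel    <- full_tank
  current <- 1L
  for (i in seq_along(gpm)) {
    fuel    <- fuel - gpm[i]
    tank[i] <- current
    if (fuel < tank_min) {     # this mile is the last one on the current tank
      fuel    <- full_tank     # refill before the next mile
      current <- current + 1L
    }
  }
  tank
}

set.seed(1)  # assumption: fixed seed so the example is reproducible
speed <- runif(3000, 25, 80)
gpm   <- 1 / (max(mtcars$mpg) - sqrt(abs(speed - 55)))

exact  <- exact_tank_number(gpm)
approx <- cumsum(gpm) %/% (13 - 1) + 1  # shortcut, shifted to start at 1

# Number of miles on which the two labelings disagree
sum(exact != approx)
```

In the loop, whatever is burned past the reserve threshold is discarded at each refill, whereas the shortcut carries it into the next tank. The shortcut therefore never switches tanks later than the loop does, and the gap can widen with each successive tank, which appears to be the conservative bias the answer describes.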