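I am trying to do a kind of running subtraction that involves two columns. Starting from the last stop in the sequence, I want to compute (DistTravelValue - distBWStops) for each row.
I first arrange the tibble in descending order. I then add a DistTravelValue column: if the stop is the maximum stop (which I determined in a previous step), it takes the value of shape_dist, otherwise 0.
Then I want to find each row's DistTravelValue by subtracting that row's distBWStops from the previous row's DistTravelValue. I have a feeling this might require purrr, but I have no idea how to go about it.
Sample data: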
trip_id seq shape_dist direction_id distBWStops MaxStop DistTravelValue
2139296 56 14.3937 0 0.255 56 14.3937
2139296 55 14.1387 0 0.2582 56 0
2139296 54 13.8805 0 0.6186 56 0
2139296 53 13.2619 0 0.1856 56 0
2139296 52 13.0763 0 0.165 56 0
2139296 51 12.9113 0 0.1326 56 0
Desired output:
trip_id seq shape_dist direction_id distBWStops MaxStop DistTravelValue
2139296 56 14.3937 0 0.255 56 14.3937
2139296 55 14.1387 0 0.2582 56 14.1355
2139296 54 13.8805 0 0.6186 56 13.5169
2139296 53 13.2619 0 0.1856 56 13.3313
2139296 52 13.0763 0 0.165 56 13.1663
2139296 51 12.9113 0 0.1326 56 13.0337
My novice attempt at this:
tripsJoined6 <- inner_join(tripsJoined5, maxStopSequence) %>%
  arrange(trip_id,
          direction_id,
          desc(seq)) %>%
  group_by(trip_id, direction_id) %>%
  mutate(DistTravelValue = ifelse(seq == MaxStop, shape_dist, 0)) %>%
  mutate(
    DistTravelValue = ifelse(
      DistTravelValue > 0,
      DistTravelValue,
      DistTravelValue[i + 1] - distBWStops[i + 1]
    )
  )
The expression DistTravelValue[i + 1] - distBWStops[i + 1] does not work.
Thanks in advance!
Answer 0 (score: 2)
The last mutate can be replaced with a cumulative sum (the sorting and grouping are omitted here):
trips %>%
mutate(DistTravelValue = cumsum(c(first(DistTravelValue), -distBWStops[-1])))
giving:
trip_id seq shape_dist direction_id distBWStops MaxStop DistTravelValue
1 2139296 56 14.3937 0 0.2550 56 14.3937
2 2139296 55 14.1387 0 0.2582 56 14.1355
3 2139296 54 13.8805 0 0.6186 56 13.5169
4 2139296 53 13.2619 0 0.1856 56 13.3313
5 2139296 52 13.0763 0 0.1650 56 13.1663
6 2139296 51 12.9113 0 0.1326 56 13.0337
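For data with more than one trip, the same cumulative sum would presumably need to run inside each group, after sorting in descending stop order. The following is a minimal sketch under that assumption: the `stops` data frame and its values are hypothetical, and the seed is taken as `first(shape_dist)`, which matches the seeded DistTravelValue only if the first row of each sorted group is the maximum stop.

```r
library(dplyr)

# Hypothetical two-trip data (made-up values, not from the question)
stops <- data.frame(
  trip_id      = c(1L, 1L, 1L, 2L, 2L, 2L),
  direction_id = 0L,
  seq          = c(1L, 2L, 3L, 1L, 2L, 3L),
  shape_dist   = c(1.0, 2.0, 3.0, 10.0, 11.0, 12.0),
  distBWStops  = c(0.5, 0.4, 0.3, 0.2, 0.1, 0.6)
)

result <- stops %>%
  # Last stop first within each trip and direction
  arrange(trip_id, direction_id, desc(seq)) %>%
  group_by(trip_id, direction_id) %>%
  # Seed with the last stop's shape_dist, then subtract each later row's
  # distBWStops cumulatively; the cumsum restarts for every group
  mutate(DistTravelValue = cumsum(c(first(shape_dist), -distBWStops[-1]))) %>%
  ungroup()
```

The cumulative sum works because each row's value is the seed minus the sum of distBWStops over all rows after the first, up to and including the current row.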
We used this as trips:
trips <-
structure(list(trip_id = c(2139296L, 2139296L, 2139296L, 2139296L,
2139296L, 2139296L), seq = 56:51, shape_dist = c(14.3937, 14.1387,
13.8805, 13.2619, 13.0763, 12.9113), direction_id = c(0L, 0L,
0L, 0L, 0L, 0L), distBWStops = c(0.255, 0.2582, 0.6186, 0.1856,
0.165, 0.1326), MaxStop = c(56L, 56L, 56L, 56L, 56L, 56L),
DistTravelValue = c(14.3937,
0, 0, 0, 0, 0)), class = "data.frame", row.names = c(NA, -6L))
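Since the question mentions purrr, the same running subtraction can also be written with `purrr::accumulate`. This is only an alternative sketch, not part of the original answer; the variable names `seed`, `dist_bw`, `via_accumulate` and `via_cumsum` are illustrative, and the numbers are copied from the sample data.

```r
library(purrr)

# Seed value (DistTravelValue at the maximum stop) and per-row distBWStops
seed    <- 14.3937
dist_bw <- c(0.255, 0.2582, 0.6186, 0.1856, 0.165, 0.1326)

# Start from the seed and subtract each subsequent row's distBWStops in turn;
# the first element of dist_bw is skipped because the seed row keeps its value
via_accumulate <- accumulate(dist_bw[-1], function(acc, d) acc - d, .init = seed)

# The cumulative-sum form, for comparison
via_cumsum <- cumsum(c(seed, -dist_bw[-1]))
```

Both vectors have one element per row; the `cumsum` form is vectorised and is likely the simpler choice when the update is a plain subtraction, while `accumulate` generalises to update rules that are not sums.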