In df, I want to keep only the rows whose intersect_street matches one of the street names in streets, while also adding the intersection_distance_meters of each removed row to the kept row above it.
> streets
[1] "FRONT ST" "2ND ST" "3RD ST" "4TH ST"
> df
              intersection segment_key intersection_distance_meters intersect_street
1       ARCH ST & FRONT ST         1EW                           81         FRONT ST
2     ARCH ST & MASCHER ST         2EW                           60       MASCHER ST
3         ARCH ST & 2ND ST         3EW                           57           2ND ST
4 ARCH ST & LITTLE BOYS CT         4EW                           28   LITTLE BOYS CT
5       ARCH ST & BREAD ST         5EW                           83         BREAD ST
6         ARCH ST & 3RD ST         6EW                          135           3RD ST
7         ARCH ST & 4TH ST         7EW                          144           4TH ST
Desired output
        intersection segment_key intersection_distance_meters intersect_street
1 ARCH ST & FRONT ST         1EW                          141         FRONT ST
2   ARCH ST & 2ND ST         3EW                          168           2ND ST
3   ARCH ST & 3RD ST         6EW                          135           3RD ST
4   ARCH ST & 4TH ST         7EW                          144           4TH ST
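I have been using lead() from dplyr to add the next row's intersect_street and intersection_distance_meters as new columns and then sum them conditionally, but I run into trouble when several non-primary intersections occur in a row (e.g., rows 4 and 5 above).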
Data
df <- structure(list(intersection = c("ARCH ST & FRONT ST", "ARCH ST & MASCHER ST",
"ARCH ST & 2ND ST", "ARCH ST & LITTLE BOYS CT", "ARCH ST & BREAD ST",
"ARCH ST & 3RD ST", "ARCH ST & 4TH ST"), segment_key = c("1EW",
"2EW", "3EW", "4EW", "5EW", "6EW", "7EW"), intersection_distance_meters = c(81,
60, 57, 28, 83, 135, 144), intersect_street = c("FRONT ST", "MASCHER ST",
"2ND ST", "LITTLE BOYS CT", "BREAD ST", "3RD ST", "4TH ST")), row.names = c(NA,
7L), class = "data.frame")
streets <- c("FRONT ST", "2ND ST", "3RD ST", "4TH ST")
Answer 0 (score: 1)
I think this is what you want. I created a few extra helper columns and left them in so that the logic is clear.
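library(dplyr)

df %>% mutate(
  keep = intersect_street %in% streets,  # TRUE for rows whose street is in `streets`
  grouper = cumsum(keep)                 # each kept row starts a new group
) %>%
  group_by(grouper) %>%
  # sum the distances of the kept row and the dropped rows that follow it
  mutate(total_intersection_dist = sum(intersection_distance_meters)) %>%
  slice(1)  # keep only the first (matching) row of each group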
# # A tibble: 4 x 7
# # Groups:   grouper [4]
#   intersection       segment_key intersection_distance_met~ intersect_street keep  grouper total_intersection_di~
#   <chr>              <chr>                            <dbl> <chr>            <lgl>   <int>                  <dbl>
# 1 ARCH ST & FRONT ST 1EW                                 81 FRONT ST         TRUE        1                    141
# 2 ARCH ST & 2ND ST   3EW                                 57 2ND ST           TRUE        2                    168
# 3 ARCH ST & 3RD ST   6EW                                135 3RD ST           TRUE        3                    135
# 4 ARCH ST & 4TH ST   7EW                                144 4TH ST           TRUE        4                    144
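
If you want the result in exactly the shape of the desired output, here is a minimal sketch of a variant that reuses the cumsum() grouping from above. It overwrites intersection_distance_meters with the group total and drops the helper column. The names group_id and result are only illustrative, and the filter(grouper > 0) step is an assumption on my part: it discards any rows that come before the first street in streets, since they have no kept row above them to be added to (there are none in this data).

```r
library(dplyr)

df <- data.frame(
  intersection = c("ARCH ST & FRONT ST", "ARCH ST & MASCHER ST",
                   "ARCH ST & 2ND ST", "ARCH ST & LITTLE BOYS CT",
                   "ARCH ST & BREAD ST", "ARCH ST & 3RD ST", "ARCH ST & 4TH ST"),
  segment_key = c("1EW", "2EW", "3EW", "4EW", "5EW", "6EW", "7EW"),
  intersection_distance_meters = c(81, 60, 57, 28, 83, 135, 144),
  intersect_street = c("FRONT ST", "MASCHER ST", "2ND ST", "LITTLE BOYS CT",
                       "BREAD ST", "3RD ST", "4TH ST"),
  stringsAsFactors = FALSE
)
streets <- c("FRONT ST", "2ND ST", "3RD ST", "4TH ST")

# The running count of matching rows: every kept street starts a new group,
# and each non-matching row inherits the group of the kept row above it.
group_id <- cumsum(df$intersect_street %in% streets)
print(group_id)
# [1] 1 1 2 2 2 3 4

result <- df %>%
  mutate(grouper = cumsum(intersect_street %in% streets)) %>%
  filter(grouper > 0) %>%  # assumption: drop rows before the first kept street
  group_by(grouper) %>%
  # replace the distance with the total for the whole group
  mutate(intersection_distance_meters = sum(intersection_distance_meters)) %>%
  slice(1) %>%             # the first row of each group is the matching street
  ungroup() %>%
  select(-grouper)

print(result)
```

Because the grouping is a running count rather than a look-ahead, it does not matter how many non-matching rows occur in a row, which is where the lead() approach gets stuck.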