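I have an sf data frame containing the number of trips from one bike station to another. The geometry column contains the direct path between the two stations (as given by OSM).
I want to plot a map in which the color of each street is a gradient of the number of trips.
My problem is that I have the number of trips per route, not per street.
I use st_intersection() to extract the part shared by two routes, and st_difference() to extract the parts that differ.
Here is an example of what I want, with two lines carrying 10 and 15 trips.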
library('sf')
library('ggplot2') # dev version
route1 <- st_linestring(rbind(c(0, 0), c(1, 1), c(2, 2), c(3, 3)))
route2 <- st_linestring(rbind(c(1, 0), c(1, 1), c(2, 2), c(3, 0)))
# the column is named count because mergeRoutes() reads route$count
route1 <- st_sf(id = 1, count = 10, geometry = st_sfc(route1))
route2 <- st_sf(id = 2, count = 15, geometry = st_sfc(route2))
# not ok: the segment (1,1 to 2,2) is supposed to have 25 trips but only shows 15
# (the number of trips of the second line plotted)
ggplot(data = rbind(route1, route2)) + geom_sf(mapping = aes(color = count)) +
  theme(panel.grid.major = element_line(colour = 'transparent'))
# mergeRoutes gives the desired output
route <- mergeRoutes(route1, route2, init = TRUE)
ggplot(data = route) + geom_sf(mapping = aes(color = count)) +
  theme(panel.grid.major = element_line(colour = 'transparent'))
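As a quick check of what the two sf calls return on their own, here is a minimal sketch on the geometries of the two toy lines. They are rebuilt here so that the snippet runs by itself, and the names a, b, shared and rest_a are illustrative only.

```r
library(sf)

a <- st_sfc(st_linestring(rbind(c(0, 0), c(1, 1), c(2, 2), c(3, 3))))
b <- st_sfc(st_linestring(rbind(c(1, 0), c(1, 1), c(2, 2), c(3, 0))))

# part of the path common to both routes: the segment from (1,1) to (2,2)
shared <- st_intersection(a, b)

# what is left of the first route once the second one is removed
rest_a <- st_difference(a, b)

st_length(shared)
st_length(rest_a)
```

The shared piece is the only part whose trip count should be the sum of both routes; the leftovers keep the count of the route they came from.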
I wrote the function mergeRoutes(), which gives what I want for two routes, but it does not scale well to many routes.
#'
#' Merge two routes. Returns the intersection (if any) with the summed count,
#' together with the remaining parts of each route. If there is no line
#' intersection, returns the second route or (if init) the two routes.
#'
#' @param route1 a row with id, count and geometry
#' @param route2 a row with id, count and geometry
#' @param init logical, whether to return the two routes even if there is no intersection
#'   or only the second one
#'
#' @return a data frame with up to 3 rows if there is a line intersection (the shared
#'   part and what remains of each route); otherwise route2, or both routes if init.
#'
mergeRoutes <- function(route1, route2, init = FALSE)
{
  intersection <- st_intersection(route1$geometry, route2$geometry)
  # if the intersection is only points or is empty then the result is the two routes
  # untouched to avoid adding too many elements to the result
  if (length(intersection) != 0 &&
      !inherits(intersection, c('sfc_POINT', 'sfc_MULTIPOINT'))) {
    # if intersection is a geometry with points and lines, extract the lines only
    intersection <- st_collection_extract(x = intersection, type = "LINESTRING")
    count <- route1$count + route2$count
    intersection <- data.frame(id = route1$id, count = count, geometry = intersection)
    # keep the parts of the initial routes that are not in the intersection
    route1_dif <- st_difference(route1$geometry, route2$geometry)
    route2_dif <- st_difference(route2$geometry, route1$geometry)
    # if one route is completely covered by the other then it is not added to the result
    if (length(route1_dif) != 0) {
      route1 <- data.frame(id = route1$id,
                           count = route1$count,
                           geometry = route1_dif)
    } else {
      route1 <- NULL
    }
    if (length(route2_dif) != 0) {
      route2 <- data.frame(id = route2$id,
                           count = route2$count,
                           geometry = route2_dif)
    } else {
      route2 <- NULL
    }
    result <- rbind(intersection, route1, route2)
  } else if (init) {
    result <- rbind(route1, route2)
  } else {
    result <- route2
  }
  return(result)
}
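So it works for two lines, but if I try to loop over all the routes between all the stations it runs indefinitely. I could not find a better approach than lapply() inside a for loop, and it does not terminate on my Mac (16 GB RAM, 2.5 GHz): even after running for 15 hours, it freezes at some point.
Here is my attempt with almost 2000 routes (the data can be found here).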
library('dplyr') # for %>% and slice()
# To merge all the routes, each new route is compared to all the rows from the previous
# comparison. New rows are added to the resulting data frame at each step. If there is no
# intersection then the route being compared to the others is added untouched.
# initiate comparison
segment_routes <- mergeRoutes(route1 = directions %>% slice(1),
                              route2 = directions %>% slice(2),
                              init = TRUE)
# compute directions segmentation for all the routes
for (i in 3:nrow(directions)) {
  new_route <- directions %>% slice(i)
  # compare the new route to the segments resulting from the previous comparison
  new_routes <- lapply(X = seq(nrow(segment_routes)),
                       FUN = function(j) mergeRoutes(route1 = segment_routes %>% slice(j),
                                                     route2 = new_route))
  new_routes <- do.call(rbind, new_routes)
  # make an sf object
  new_routes <- st_sf(new_routes,
                      geometry = new_routes$geometry,
                      crs = st_crs(directions))
  # add the new segments to the ones from the previous iteration
  segment_routes <- rbind(segment_routes, new_routes)
}
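I know you can pass a data frame directly to st_intersection(), but I do not know how to specify that I want the counts added, and more than 2 routes can share the same part of a street, so a single call to st_intersection() would not give the right output.
I use sf and data frames here, but any solution using sp and/or data.table or other packages would be perfect for me.
Any help is much appreciated.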
Edit: here is my session info
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_2.2.1.9000 sf_0.6-0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.15 class_7.3-14 withr_2.1.1.9000 plyr_1.8.4
[5] grid_3.4.3 gtable_0.2.0 DBI_0.7 magrittr_1.5
[9] e1071_1.6-8 units_0.5-1 scales_0.5.0.9000 pillar_1.2.1
[13] rlang_0.2.0 lazyeval_0.2.1 tools_3.4.3 udunits2_0.13
[17] munsell_0.4.3 yaml_2.1.17 compiler_3.4.3 colorspace_1.3-2
[21] classInt_0.1-24 tibble_1.4.2
Answer 0 (score: 3)
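Assuming all your routes are LINESTRINGs, and a LINESTRING is just a sequence of coordinates, we can treat each consecutive pair of coordinates as a 'from' and a 'to'.
If we use data.table to store the coordinates (rather than sf), the solution becomes a simple group & count operation, which should scale well to larger data sets.
Here is an example on the data you provided in the link.
Step 1 - convert to data.table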
library(sf)
library(data.table)
library(googleway) ## for plotting. can also use ggplot2, ggmap, leaflet, mapview...
sf <- readRDS("~/Downloads/directions.rds")
sf$row_id <- 1:nrow(sf) ## for joining
dt_routes <- as.data.table(st_coordinates(sf))
## put on the rest of the trip data
## the 'L1' value from `st_coordinates` is the index of each LINESTRING,
## so it joins to the sequential `row_id` created above
dt_sf <- sf
st_geometry(dt_sf) <- NULL
dt_routes <- dt_routes[
  dt_sf
  , on = c(L1 = "row_id")
  , nomatch = 0
]
Step 2 - create 'from' and 'to'
We can shift the X and Y columns within each route to make the 'from' and 'to' columns
dt_routes[
, `:=`(X_to = shift(X, type = "lead"),
Y_to = shift(Y, type = "lead"))
, by = L1
]
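To see what the shift does, here is a minimal sketch on hand-typed coordinates of the two toy routes from the question. The table dt_toy is made up for illustration and is not the linked data.

```r
library(data.table)

# coordinates of the two toy routes, one row per vertex
dt_toy <- data.table(
  L1 = rep(1:2, each = 4),
  X  = c(0, 1, 2, 3,  1, 1, 2, 3),
  Y  = c(0, 1, 2, 3,  0, 1, 2, 0)
)

# each row now holds a vertex and the next vertex of the same route
dt_toy[, `:=`(X_to = shift(X, type = "lead"),
              Y_to = shift(Y, type = "lead")), by = L1]
dt_toy[]
```

The last vertex of each route has no successor, so its X_to and Y_to are NA; those rows are the ones dropped by the !is.na(X_to) filter in the next step.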
Step 3 - group & count
Now we can sum the number of trips for each pair of coordinates
dt_trips <- dt_routes[
!is.na(X_to)
, .(n_trips = sum(count))
, by = .(X, Y, X_to, Y_to)
]
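Note that this grouping is directional: a segment travelled from A to B and the same segment travelled from B to A end up in different groups. Whether the two directions should be pooled is an assumption about what the map is meant to show. If they should, one possible sketch is to put the two endpoints of every segment in a fixed order before grouping. The table dt_toy and the columns swap, xa, ya, xb and yb below are hypothetical names used only for this illustration.

```r
library(data.table)

# route 1 goes (0,0) -> (1,1) -> (2,2); route 2 goes (2,2) -> (1,1) -> (1,0)
dt_toy <- data.table(
  L1    = rep(1:2, each = 3),
  X     = c(0, 1, 2,  2, 1, 1),
  Y     = c(0, 1, 2,  2, 1, 0),
  count = rep(c(10, 15), each = 3)
)
dt_toy[, `:=`(X_to = shift(X, type = "lead"),
              Y_to = shift(Y, type = "lead")), by = L1]
dt_toy <- dt_toy[!is.na(X_to)]

# swap the endpoints whenever the 'from' point sorts after the 'to' point
dt_toy[, swap := X > X_to | (X == X_to & Y > Y_to)]
dt_toy[, `:=`(
  xa = ifelse(swap, X_to, X), ya = ifelse(swap, Y_to, Y),
  xb = ifelse(swap, X, X_to), yb = ifelse(swap, Y, Y_to)
)]

# group on the ordered endpoints so that both directions are pooled
dt_undirected <- dt_toy[, .(n_trips = sum(count)), by = .(xa, ya, xb, yb)]
dt_undirected
```

Either way, the grouping only matches segments whose vertex coordinates are exactly equal, which is the case when all routes come from the same routing network.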
Step 4 - convert back to sf
After some rearranging, we can now convert each from/to pair into a LINESTRING, each with its own weight (i.e. n_trips)
dt_trips[, line_id := .I]
dt_from <- dt_trips[, .(X, Y, n_trips, line_id)]
dt_to <- dt_trips[, .(X = X_to, Y = Y_to, n_trips, line_id)]
dt_from[, line_sequence := 1]
dt_to[, line_sequence := 2]
dt_trips <- rbindlist(list(
dt_from, dt_to
))
setorder(dt_trips, line_id, line_sequence)
## convert back to `sf` object
dt_trips <- dt_trips[, {
geometry <- sf::st_linestring(x = matrix(c(X, Y), ncol = 2))
geometry <- sf::st_sfc(geometry)
geometry <- sf::st_sf(geometry)
}, by = .(line_id, n_trips)]
sf_trips <- sf::st_as_sf(dt_trips)
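One detail the conversion leaves open is the coordinate reference system: st_sfc() is called without a crs, so the rebuilt lines carry none. Below is a minimal sketch of building segments with a CRS attached, on a made-up two-row table. The names segs, geoms and sf_segs are illustrative, and EPSG:4326 is an assumption; in practice one would probably reuse st_crs() of the original routes.

```r
library(sf)

# two hypothetical from/to segments with their trip counts
segs <- data.frame(
  X = c(0, 1), Y = c(0, 1),
  X_to = c(1, 2), Y_to = c(1, 2),
  n_trips = c(10, 25)
)

# one 2 x 2 coordinate matrix (from-row, to-row) per segment
geoms <- lapply(seq_len(nrow(segs)), function(i) {
  st_linestring(matrix(c(segs$X[i], segs$X_to[i],
                         segs$Y[i], segs$Y_to[i]), ncol = 2))
})

# assumption: the coordinates are longitude/latitude in WGS84
sf_segs <- st_sf(n_trips = segs$n_trips,
                 geometry = st_sfc(geoms, crs = 4326))
sf_segs
```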
Step 5 - plot
## applying a log-transform so the contrast shows up
sf_trips$n_trips <- log(sf_trips$n_trips)
library(googleway)
set_key("GOOGLE_MAP_KEY")
google_map(data = sf_trips) %>%
add_polylines(
stroke_colour = "n_trips"
, stroke_opacity =1
, stroke_weight = 3.5
#, legend = T
, info_window = "n_trips"
, palette = viridisLite::viridis
)
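Since the question plots with ggplot2, the same kind of map can be sketched with geom_sf(), which does not need a Google Maps key. This is a minimal sketch on made-up segments (sf_toy is illustrative, not the real result). It uses a log-transformed colour scale rather than transforming the data, so the legend keeps showing trip counts.

```r
library(sf)
library(ggplot2)

# three hypothetical segments with very different trip counts
sf_toy <- st_sf(
  n_trips = c(10, 25, 400),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(1, 1))),
    st_linestring(rbind(c(1, 1), c(2, 2))),
    st_linestring(rbind(c(2, 2), c(3, 0)))
  )
)

p <- ggplot(data = sf_toy) +
  geom_sf(mapping = aes(colour = n_trips), size = 1.2) +
  scale_colour_viridis_c(trans = "log") +   # log scale so low counts stay visible
  theme(panel.grid.major = element_line(colour = 'transparent'))
p
```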