Converting a geometry column in R

Asked: 2017-12-05 19:36:11

Tags: r geometry

https://data.sfgov.org/Transportation/Bike-Share-Stations/gtyg-jpkj

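I am working with this dataset and would like to know whether the geometry (the Geom column in the table) can be split into two columns in R: longitude and latitude.

Thanks!

3 answers:

Answer 0: (score: 1)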

RSocrata::read.socrata and tidyr::extract keep this concise:

library(tidyverse)

df <- RSocrata::read.socrata('https://data.sfgov.org/Transportation/Bike-Share-Stations/gtyg-jpkj')

df <- df %>% extract(Geom, c('lat', 'lon'), '\\((.*), (.*)\\)', convert = TRUE) 

# print nicely
df %>% select(UID, Site.ID, lat, lon) %>% as_tibble()  # as_data_frame() is deprecated
#> # A tibble: 107 x 4
#>      UID    Site.ID      lat       lon
#>  * <int>      <chr>    <dbl>     <dbl>
#>  1     1  SF-T24 S1 37.75182 -122.4266
#>  2     2  SF-G33 S1 37.79350 -122.3928
#>  3     3   SOMA-06A 37.78974 -122.3947
#>  4     4  SF-T22 S5 37.75128 -122.4318
#>  5     5  SF-R25 S4 37.75671 -122.4210
#>  6     6    NOMA-2E 37.79861 -122.4008
#>  7     7  SF-L33 S4 37.77590 -122.3932
#>  8     8  SF-O24 S4 37.76623 -122.4269
#>  9     9 Market-03B 37.78099 -122.4117
#> 10    10  SF-O28 S2 37.76723 -122.4108
#> # ... with 97 more rows

Answer 1: (score: 0)

Yes. The simplest way is probably the tidyr package. Here is a one-liner:

library(data.table)  # provides fread()
library(tidyr)
df <- fread("~/Downloads/Bike_Share_Stations.csv") # Read data

extract(df, Geom, into = c('Lat', 'Lon'), '\\((.*),(.*)\\)', convert = TRUE)

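The regular expression argument uses capture groups. The pattern is simple: it begins with a literal (. The two inner groups, each written (.*), capture the two coordinates on either side of the comma; only these are extracted. The pattern ends with the matching literal ).

Here is a subset of the resulting data: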

     UID    Site ID             Last Edited Date           Lat             Lon
  1:   1  SF-T24 S1 05/23/2016 12:00:00 AM +0000 37.7518243814  -122.426627114
  2:   2  SF-G33 S1 05/23/2016 12:00:00 AM +0000 37.7935049482  -122.392846514
  3:   3   SOMA-06A 05/23/2016 12:00:00 AM +0000 37.7897420277  -122.394678441
  4:   4  SF-T22 S5 05/23/2016 12:00:00 AM +0000 37.7512809413  -122.431836215
  5:   5  SF-R25 S4 05/23/2016 12:00:00 AM +0000 37.7567132725  -122.421038213
 ---                                                                          
103: 103     Embr-E 05/23/2016 12:00:00 AM +0000 37.8047749378  -122.403247294
104: 104  SF-N26 S1 05/23/2016 12:00:00 AM +0000 37.7682271629  -122.420291015
105: 105 Market-11B 05/23/2016 12:00:00 AM +0000 37.7922638478  -122.397066071
106: 106  SF-O27 S2 05/23/2016 12:00:00 AM +0000 37.7671609432  -122.415485214
107: 107  SF-T23 S5 05/23/2016 12:00:00 AM +0000 37.7514609421  -122.429135213
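
The capture-group pattern described in this answer can also be tried on its own, without tidyr. The following is only a minimal sketch in base R: the two sample strings are copied from the output above, and the variable names (geom, pattern, matches, lat, lon) are made up for illustration. The pattern adds " *" after the comma so the space before the longitude is not captured.

```r
# Sample strings in the "(lat, lon)" format of the Geom column
geom <- c("(37.7518243814, -122.426627114)",
          "(37.7935049482, -122.392846514)")

# Literal "(", two capture groups separated by a comma, literal ")"
pattern <- "\\((.*), *(.*)\\)"

# regexec() locates the full match and each group; regmatches() extracts them
matches <- regmatches(geom, regexec(pattern, geom))

# Element 1 of each match is the whole string; elements 2 and 3 are the groups
lat <- as.numeric(vapply(matches, `[`, character(1), 2))
lon <- as.numeric(vapply(matches, `[`, character(1), 3))

print(data.frame(lat, lon))
```

This is roughly what extract does for each row before it adds the new columns; with convert = TRUE it also turns the captured text into numbers, which as.numeric does here.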

Answer 2: (score: 0)

I think the Geom column already contains the latitude/longitude.

library(tidyverse)

df <- df %>% 
  mutate(Geom = gsub('[()°]', '', Geom)) %>% 
  separate(col = Geom, into = c('Latitude', 'Longitude'), sep = '\\,')

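First, we use gsub('[()°]', '', Geom) to strip the parentheses and any degree symbol, and overwrite the Geom column with the result. Then separate splits the Geom column into the new Latitude and Longitude columns, using the comma as the delimiter (sep = '\\,').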
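
Note that separate leaves the new columns as character strings unless convert = TRUE is passed. As a rough sketch, the same two steps (strip the punctuation, then split on the comma) can be written in base R with an explicit numeric conversion. The sample values and the names geom, cleaned, parts and coords are assumptions for illustration, not part of the dataset's API.

```r
# Two sample values in the "(lat, lon)" format of the Geom column
geom <- c("(37.7518243814, -122.426627114)",
          "(37.7935049482, -122.392846514)")

# Step 1: drop parentheses and any degree symbol (\u00b0 is the degree sign)
cleaned <- gsub("[()\u00b0]", "", geom)

# Step 2: split on the comma, swallowing any spaces that follow it
parts <- strsplit(cleaned, ",\\s*")

# Build numeric columns from the first and second piece of each split
coords <- data.frame(
  Latitude  = as.numeric(vapply(parts, `[`, character(1), 1)),
  Longitude = as.numeric(vapply(parts, `[`, character(1), 2))
)

str(coords)
```

Splitting on ",\\s*" rather than a bare comma avoids a leading space in the longitude text, although as.numeric would tolerate that space anyway.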