这是Geospatial Analytics in Python
的后续问题这个问题类似于https://gis.stackexchange.com/questions/84114/shapely-unable-to-tell-if-polygon-contains-point,但是逆转lat似乎解决了这个问题,但它并没有帮助我。
Uber数据以纬度/长度给出,这是非常直接的,反向查找给出了在geopandas中使用的地址
然而问题是在shapefile中查找lat / long。优步数据如下:
Trip ID DateTime Stamp Lat Long
00001 2007-01-07T10:56:46+00:00 37.786117 -122.440119
00001 2007-01-07T10:56:50+00:00 37.786564 -122.440209
00001 2007-01-07T10:56:54+00:00 37.786905 -122.440270
00001 2007-01-07T10:56:58+00:00 37.786956 -122.440279
00002 2007-01-06T06:22:35+00:00 37.800224 -122.433520
00002 2007-01-06T06:22:39+00:00 37.800155 -122.434101
00002 2007-01-06T06:22:43+00:00 37.800160 -122.434430
00002 2007-01-06T06:22:47+00:00 37.800378 -122.434527
00002 2007-01-06T06:22:51+00:00 37.800738 -122.434598
00002 2007-01-06T06:22:55+00:00 37.800938 -122.434650
00002 2007-01-06T06:22:59+00:00 37.801024 -122.434889
00002 2007-01-06T06:23:03+00:00 37.800955 -122.435392
00002 2007-01-06T06:23:07+00:00 37.800886 -122.435959
00002 2007-01-06T06:23:11+00:00 37.800811 -122.436275
形状文件多边形边界看起来像
(5979385.645656899, 2110931.7279282957, 5988491.7629433125, 2116394.4427246302)
(5996757.772329897, 2104615.921334222, 6002126.622484565, 2111141.524096638)
(5994970.50687556, 2086244.426253125, 6004106.84030889, 2096245.441356048)
(6005060.663860559, 2117913.4127838016, 6010794.38500464, 2123410.4359104633)
(5999414.325087652, 2098231.5748509616, 6005330.746325642, 2103724.0536953807)
(5990180.636205971, 2101104.2121503055, 5997586.527562141, 2107405.9502029717)
(6005605.349122897, 2109599.6380728036, 6010954.164540723, 2115863.756778136)
(5997399.803198054, 2095859.3430468887, 6002045.244038388, 2100357.5978298783)
(6018974.499877974, 2121660.499777794, 6024740.999827892, 2131294.0001958013)
(5980891.2469763905, 2086337.3158311248, 5992333.58203131, 2097376.2589762956)
(5979838.815354228, 2109536.4948263764, 5990061.512428477, 2115435.3563882113)
(5996370.188459396, 2086085.1349050552, 6006040.649761483, 2089160.6310506314)
(6000325.210404977, 2087887.1444243789, 6011873.615785807, 2095773.4459089637)
(5980631.069675222, 2095815.8703648, 5992293.742215976, 2101164.5775151253)
(6010609.867329061, 2112785.889902383, 6015766.567317471, 2119365.8508238047)
(5991138.3905240595, 2086268.6489737183, 5998688.01650089, 2094657.981276378)
(6004790.221816152, 2100493.380038634, 6011576.786655068, 2109303.3404370546)
(5991505.183097556, 2091674.2884248793, 6000205.414384723, 2102574.580600634)
所以polygon / polygon.contains(point)中的点不起作用。看看数据,lat长度比较匀称的文件非常小,我不确定我是否要将一个单位转换为另一个单位,看起来完全不同的公制系统:)下面是代码:
import fiona
import shapely
from shapely.geometry import Point
import geopy
from geopy.geocoders import Nominatim
from shapely.geometry import shape
fc = fiona.open('/home/user/geo/sfo_shapefile/planning_neighborhoods.shp')
print fc.schema
pol = fc.next()
for f in fc:
print shape(f['geometry']).bounds
geom = shape(pol['geometry'])
print "Bigger poly shape" ,shape(pol['geometry']).bounds
geolocator = Nominatim()
for cords in open('/home/user/geo/uber/trips.tsv'):
latlong = cords.split('\t')
p = Point(float(latlong[3]),float(latlong[2]))
p = Point(float(37.783383),float(-122.439594))
if geom.contains(p):
print geolocator.reverse(p).address
Uber数据和SFO shapefile的链接在这里http://hortonworks.com/blog/magellan-geospatial-analytics-in-spark/#comment-606532