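I'm trying to convert a shapefile I got from NYC Open Data to EPSG:4326 so I can use it with some coordinates to create a choropleth. The shapefile is in EPSG:2263, so after importing it into GeoPandas
I tried the following: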
# import shapefile data, convert to EPSG:4326
shapeData = gpd.read_file("~/Portfolio/nynta_19b/nynta.shp")
shapeData = shapeData.to_crs({ "init": "epsg:4326" })
print(shapeData.crs, "\n\n", shapeData.head())
{'init': 'epsg:4326'}
BoroCode BoroName CountyFIPS NTACode NTAName Shape_Leng \
0 3 Brooklyn 047 BK88 Borough Park 39247.228028
1 4 Queens 081 QN51 Murray Hill 33266.904995
2 4 Queens 081 QN27 East Elmhurst 19816.712293
3 4 Queens 081 QN07 Hollis 20976.335574
4 1 Manhattan 061 MN06 Manhattanville 17040.685413
Shape_Area geometry
0 5.400502e+07 POLYGON ((-73.97604935657382 40.63127590564677...
1 5.248828e+07 POLYGON ((-73.80379022888246 40.77561011179249...
2 1.972685e+07 POLYGON ((-73.86109724401859 40.76366447708767...
3 2.288777e+07 POLYGON ((-73.75725671509139 40.71813860166254...
4 1.064708e+07 POLYGON ((-73.94607828674226 40.82126321606192...
So far, so good. Then I load the data I plan to use to fill in the map from the shapefile:
# properly format latitude and longitude columns for geopandas join
geoData = gpd.GeoDataFrame(initialData,
crs={ "init": "epsg:4326" },
geometry=gpd.points_from_xy(initialData.longitude, initialData.latitude))
print(geoData.head())
cmplnt_num cmplnt_fr_dt cmplnt_fr_tm ky_cd pd_cd law_cat_cd boro_nm \
0 407134549 04/30/2011 23:35:00 235 511.0 MISDEMEANOR BROOKLYN
1 513417880 04/30/2011 23:34:00 105 397.0 FELONY BROOKLYN
2 501724753 04/30/2011 23:30:00 353 462.0 MISDEMEANOR BRONX
3 250100240 04/30/2011 23:30:00 578 638.0 VIOLATION BRONX
4 633777397 04/30/2011 23:30:00 347 905.0 MISDEMEANOR BROOKLYN
loc_of_occur_desc prem_typ_desc x_coord_cd ... \
0 INSIDE RESIDENCE - PUBLIC HOUSING 1002397.0 ...
1 NaN STREET 1012560.0 ...
2 FRONT OF STREET 1012918.0 ...
3 INSIDE RESIDENCE - APT. HOUSE 1018694.0 ...
4 NaN HIGHWAY/PARKWAY 987712.0 ...
susp_race susp_sex latitude longitude \
0 NaN NaN 40.688071 -73.934567
1 BLACK M 40.650501 -73.897978
2 NaN NaN 40.826971 -73.896414
3 WHITE HISPANIC M 40.825312 -73.875547
4 NaN NaN 40.585357 -73.987537
lat_lon patrol_boro vic_age_group \
0 (40.688070936, -73.934566849) PATROL BORO BKLYN NORTH NaN
1 (40.650501226, -73.897978491) PATROL BORO BKLYN SOUTH 25-44
2 (40.826971006, -73.896414356) PATROL BORO BRONX 45-64
3 (40.825311778, -73.875546821) PATROL BORO BRONX 25-44
4 (40.58535697, -73.987537333) PATROL BORO BKLYN SOUTH NaN
vic_race vic_sex geometry
0 UNKNOWN E POINT (-73.93456684899999 40.688070936)
1 BLACK M POINT (-73.897978491 40.650501226)
2 WHITE HISPANIC M POINT (-73.89641435600001 40.826971006)
3 WHITE HISPANIC M POINT (-73.875546821 40.825311778)
4 UNKNOWN E POINT (-73.98753733300001 40.58535697)
[5 rows x 22 columns]
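However, when I try to perform a spatial join with my coordinate data (associating each pair of coordinates with its corresponding Neighborhood Tabulation Area), I get the following warning: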
# execute spatial join of csv data into shapefile
combined = gpd.sjoin(shapeData, geoData, how="left", op="intersects")
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/numpy/lib/function_base.py:2167: RuntimeWarning: invalid value encountered in ? (vectorized)
outputs = ufunc(*inputs)
Also, after converting the shapefile, looking up its crs attribute apparently gives me the following:
{ "init": "epsg:4326", "no_defs": True }
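So I tried setting the CRS attribute of the coordinates to the same thing, but it still doesn't work. When I try to plot, I get a warning about invalid input or something similar, and after half an hour of my laptop sounding like a jet engine I still don't have the choropleth I'm after.
As you can see, I tried to make sure both datasets were properly formatted before putting them together. What am I doing wrong, and how can I fix it?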
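One thing that may be worth ruling out regarding the RuntimeWarning: NumPy typically emits "invalid value encountered" when a computation hits NaN, so rows with missing latitude or longitude are a possible cause. This is only an assumption; nothing in the output above confirms it. The sketch below uses plain Python rather than GeoPandas, and the helper name `split_by_valid_coords` and the last two sample rows are hypothetical, made up for illustration.

```python
import math

def split_by_valid_coords(rows, lat_key="latitude", lon_key="longitude"):
    """Separate rows whose latitude/longitude are finite numbers from the rest."""
    valid, invalid = [], []
    for row in rows:
        lat, lon = row.get(lat_key), row.get(lon_key)
        try:
            ok = math.isfinite(float(lat)) and math.isfinite(float(lon))
        except (TypeError, ValueError):
            ok = False  # None or non-numeric text counts as missing
        (valid if ok else invalid).append(row)
    return valid, invalid

# The first two rows reuse coordinates from the output above;
# the last two are invented to show what missing values look like.
sample = [
    {"cmplnt_num": 407134549, "latitude": 40.688071, "longitude": -73.934567},
    {"cmplnt_num": 513417880, "latitude": 40.650501, "longitude": -73.897978},
    {"cmplnt_num": 1, "latitude": float("nan"), "longitude": -73.9},
    {"cmplnt_num": 2, "latitude": None, "longitude": None},
]
valid, invalid = split_by_valid_coords(sample)
print(len(valid), "usable rows;", len(invalid), "rows with missing coordinates")
# prints: 2 usable rows; 2 rows with missing coordinates
```

If rows like these do turn up in the real data, the pandas equivalent would be something like `initialData.dropna(subset=["latitude", "longitude"])` before building the point geometry.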