I received lists of coordinates in a CSV file (shown in the image). How should I convert them into polygons in a GeoDataFrame?
Below are the coordinates of one polygon; I have thousands of rows.
[118.103198,24.527338],[118.103224,24.527373],[118.103236,24.527366],[118.103209,24.527331],[118.103198,24.527338]
I tried the following code:
def bike_fence_format(s):
    s = s.replace('[', '').replace(']', '').split(',')
    return s
df['FENCE_LOC'] = df['FENCE_LOC'].apply(bike_fence_format)
df['LAT'] = df['FENCE_LOC'].apply(lambda x: x[1::2])
df['LON'] = df['FENCE_LOC'].apply(lambda x: x[::2])
df['geom'] = Polygon(zip(df['LON'].astype(str),df['LAT'].astype(str)))
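But the last step fails because df['LON'] returns a Series rather than a string. How can I get around this? A simpler way to reach my goal would be even better.
Answer 0 (score: 1)
First, recreate the sample df that your .csv file would give (depending on how you read it with .read_csv()).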
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon
df = pd.DataFrame({'FENCE_LOC': ['[32250,175889],[33913,180757],[29909,182124],[28246,177257],[32250,175889]',
                                 '[32250,175889],[33913,180757],[29909,182124],[28246,177257],[32250,175889]',
                                 '[32250,175889],[33913,180757],[29909,182124],[28246,177257],[32250,175889]']}, index=[0, 1, 2])
Modify your function slightly, because we want numeric values rather than strings:
def bike_fence_format(s):
    s = s.replace('[', '').replace(']', '').split(',')
    s = [float(x) for x in s]
    return s
df['FENCE_LOC'] = df['FENCE_LOC'].apply(bike_fence_format)
df['LAT'] = df['FENCE_LOC'].apply(lambda x: x[1::2])
df['LON'] = df['FENCE_LOC'].apply(lambda x: x[::2])
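As an aside, the bracketed string becomes valid JSON once it is wrapped in one more pair of brackets, so the standard-library json module can parse it into coordinate pairs directly, without the stride slicing. A minimal sketch, assuming every row looks like the sample in the question; parse_ring is a hypothetical helper name, not part of pandas or Shapely:

```python
import json

def parse_ring(s):
    """Parse '[x1,y1],[x2,y2],...' into a list of (x, y) float tuples."""
    # Wrapping in [] turns the comma-separated pairs into one JSON array.
    pairs = json.loads("[" + s + "]")
    return [(float(x), float(y)) for x, y in pairs]

sample = ("[118.103198,24.527338],[118.103224,24.527373],"
          "[118.103236,24.527366],[118.103209,24.527331],"
          "[118.103198,24.527338]")
ring = parse_ring(sample)
print(ring[0])
```

A list produced this way can be passed straight to Polygon(...), one row at a time.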
We can use a couple of list comprehensions to build a list of Shapely polygons.
geom_list = [(x, y) for x, y in zip(df['LON'],df['LAT'])]
geom_list_2 = [Polygon(tuple(zip(x, y))) for x, y in geom_list]
Finally, we can create a gdf from the list of Shapely polygons.
polygon_gdf = gpd.GeoDataFrame(geometry=geom_list_2)
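The attempt in the question failed because Polygon(...) was called once on whole columns, whereas a polygon has to be built per row. The same steps can also be condensed into a single apply. The sketch below assumes the coordinates are longitude/latitude in WGS 84 (hence crs="EPSG:4326", which the question does not state); to_polygon is a hypothetical helper name:

```python
import json

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

df = pd.DataFrame({"FENCE_LOC": [
    "[118.103198,24.527338],[118.103224,24.527373],"
    "[118.103236,24.527366],[118.103209,24.527331],"
    "[118.103198,24.527338]",
]})

def to_polygon(s):
    # Hypothetical helper: one bracketed string -> one Shapely Polygon.
    return Polygon(json.loads("[" + s + "]"))

# Build one geometry per row, then attach it to the original columns.
geometry = df["FENCE_LOC"].apply(to_polygon)
gdf = gpd.GeoDataFrame(df, geometry=geometry, crs="EPSG:4326")
```

Drop the crs argument, or change it, if the coordinates are in a projected system such as the one used in the sample df above.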
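Answer 1 (score: 1)
To provide a small representative dataset similar to what the OP posted as an image, I created the following rows of data (apologies for the many decimal digits):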
[[-2247824.100899419,-4996167.43201861],[-2247824.100899419,-4996067.43201861],[-2247724.100899419,-4996067.43201861],[-2247724.100899419,-4996167.43201861],[-2247824.100899419,-4996167.43201861]]
[[-2247724.100899419,-4996167.43201861],[-2247724.100899419,-4996067.43201861],[-2247624.100899419,-4996067.43201861],[-2247624.100899419,-4996167.43201861],[-2247724.100899419,-4996167.43201861]]
[[-2247624.100899419,-4996167.43201861],[-2247624.100899419,-4996067.43201861],[-2247524.100899419,-4996067.43201861],[-2247524.100899419,-4996167.43201861],[-2247624.100899419,-4996167.43201861]]
[[-2247824.100899419,-4996067.43201861],[-2247824.100899419,-4995967.43201861],[-2247724.100899419,-4995967.43201861],[-2247724.100899419,-4996067.43201861],[-2247824.100899419,-4996067.43201861]]
[[-2247724.100899419,-4996067.43201861],[-2247724.100899419,-4995967.43201861],[-2247624.100899419,-4995967.43201861],[-2247624.100899419,-4996067.43201861],[-2247724.100899419,-4996067.43201861]]
[[-2247624.100899419,-4996067.43201861],[-2247624.100899419,-4995967.43201861],[-2247524.100899419,-4995967.43201861],[-2247524.100899419,-4996067.43201861],[-2247624.100899419,-4996067.43201861]]
[[-2247824.100899419,-4995967.43201861],[-2247824.100899419,-4995867.43201861],[-2247724.100899419,-4995867.43201861],[-2247724.100899419,-4995967.43201861],[-2247824.100899419,-4995967.43201861]]
[[-2247724.100899419,-4995967.43201861],[-2247724.100899419,-4995867.43201861],[-2247624.100899419,-4995867.43201861],[-2247624.100899419,-4995967.43201861],[-2247724.100899419,-4995967.43201861]]
[[-2247624.100899419,-4995967.43201861],[-2247624.100899419,-4995867.43201861],[-2247524.100899419,-4995867.43201861],[-2247524.100899419,-4995967.43201861],[-2247624.100899419,-4995967.43201861]]
This data is saved as the file polygon_data.csv.
For the code, the modules are loaded first:
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon
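Then the data is read with pandas.read_csv() to create a dataframe. To put each row of data into a single column of the dataframe, delimiter="x" is used. Since no row of the data contains an x, each whole row ends up as one long string.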
df3 = pd.read_csv('polygon_data.csv', header=None, index_col=None, delimiter="x")
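If relying on an unused delimiter character feels fragile, the same rows can be read with plain Python, since each line of the file is a valid JSON array. A sketch under that assumption; so that it runs on its own, the demo writes a two-row file named polygon_demo.csv (a made-up name) into a temporary directory:

```python
import json
import os
import tempfile

# Two demo rows in the same one-polygon-per-line layout as polygon_data.csv.
rows = [
    "[[0,0],[0,100],[100,100],[100,0],[0,0]]",
    "[[100,0],[100,100],[200,100],[200,0],[100,0]]",
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "polygon_demo.csv")
    with open(path, "w") as f:
        f.write("\n".join(rows) + "\n")

    # Read it back: one list of [x, y] pairs per non-empty line.
    with open(path) as f:
        rings = [json.loads(line) for line in f if line.strip()]

print(len(rings), rings[0][0])
```

Each element of rings can then be passed to Polygon(...) to build the geometry list.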
To see the contents of df3, one can run
df3.head()
and get a single-column dataframe (header: 0):
0
0 [[-2247824.100899419,-4996167.43201861],[-2247...
1 [[-2247724.100899419,-4996167.43201861],[-2247...
2 [[-2247624.100899419,-4996167.43201861],[-2247...
3 [[-2247824.100899419,-4996067.43201861],[-2247...
4 [[-2247724.100899419,-4996067.43201861],[-2247...
Next, df3 is used to create a GeoDataFrame. The data in each row of df3 is used to create a Polygon object, which serves as the geometry of the GeoDataFrame polygon_df3.
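import json  # json.loads parses the coordinate string without resorting to eval()

# Build one Polygon per row from the parsed list of [x, y] pairs.
geometry = [Polygon(json.loads(xy_string)) for xy_string in df3[0]]
polygon_df3 = gpd.GeoDataFrame(
    df3,
    # crs="EPSG:4326",  # uncomment this if (x, y) is lon/lat
    geometry=geometry)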
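With thousands of rows, a single malformed line would make the list comprehension raise partway through. One option is a cheap pre-check before building any geometry. The sketch below assumes, as in the sample data, that a usable ring has at least four points and repeats its first point at the end; ring_problem is a hypothetical helper, not a Shapely function:

```python
import json

def ring_problem(xy_string):
    """Return a short description of what is wrong with a row, or None."""
    try:
        ring = json.loads(xy_string)
    except json.JSONDecodeError:
        return "not parseable"
    if not isinstance(ring, list) or len(ring) < 4:
        return "fewer than 4 points"
    if any(not isinstance(p, list) or len(p) != 2 for p in ring):
        return "point is not an [x, y] pair"
    if ring[0] != ring[-1]:
        return "ring is not closed"
    return None

rows = [
    "[[0,0],[0,1],[1,1],[0,0]]",   # looks fine
    "[[0,0],[0,1],[1,1]]",         # too short
    "[[0,0],[0,1],[1,1],[1,0]]",   # first and last point differ
    "[[0,0],[0,1],[1,1",           # truncated line
]
problems = {i: ring_problem(r) for i, r in enumerate(rows)}
```

Rows whose entry in problems is not None could be logged or dropped before the Polygon objects are created.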
Finally, the GeoDataFrame can be plotted with a single command:
# plot the GeoDataFrame
polygon_df3.plot(edgecolor='black')
For the sample data above, the output plot is a 3 x 3 grid of adjacent squares: