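I am trying to join some geo dataframes using the Dask Python package. While implementing my data-processing algorithm, I ran into the following exception: AttributeError: 'DataFrame' object has no attribute '_example'
Here is my code: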
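import dask.dataframe as dd
import dask_geopandas as dg
import geopandas as gpd
import pandas as pd
import dask
# Read both shapefiles as GeoDataFrames.
df1 = gpd.read_file("shapefile1.shp")
df2 = gpd.read_file("shapefile2.shp")
# Wrap each one in a plain Dask DataFrame.
df1 = dd.from_pandas(df1, npartitions=1)
df2 = dd.from_pandas(df2, npartitions=1)
# This call raises the AttributeError shown in the traceback.
gf2 = dg.sjoin(df1, df2, how='inner', op='intersects')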
Here is my stack trace:
Traceback (most recent call last):
  File "test.py", line 21, in <module>
    gf2 = dg.sjoin(df1, df2, how='inner', op='intersects')
  File "/usr/local/lib/python3.6/dist-packages/dask_geopandas-0.0.1-py3.6.egg/dask_geopandas/core.py", line 413, in sjoin
    example = gpd.tools.sjoin(left._example, right._example, how=how, op=op)
  File "/home/mapseeuser/.local/lib/python3.6/site-packages/dask/dataframe/core.py", line 2414, in __getattr__
    raise AttributeError("'DataFrame' object has no attribute %r" % key)
AttributeError: 'DataFrame' object has no attribute '_example'
So, can anyone tell me what I am doing wrong, and how to join two datasets using Dask?
Answer 0 (score: 0)
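The traceback points at the cause: dg.sjoin reads left._example and right._example, but df1 and df2 were built with dd.from_pandas, so they are plain Dask DataFrames, and Dask's __getattr__ raises AttributeError for any name it cannot resolve. The sketch below reproduces that mechanism with hypothetical stand-in classes. PlainFrame, GeoFrame and this toy sjoin are illustrations only, not the real Dask or dask_geopandas types, and it assumes, as the traceback suggests, that the join only works on a frame type that carries _example.

```python
class PlainFrame:
    """Hypothetical stand-in for a Dask DataFrame built with dd.from_pandas."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __getattr__(self, key):
        # Python calls __getattr__ only after normal attribute lookup fails.
        # Known column names resolve; anything else raises AttributeError,
        # with the same message format as in the traceback.
        if key in self.__dict__.get("columns", ()):
            return "<column %s>" % key
        raise AttributeError("'DataFrame' object has no attribute %r" % key)


class GeoFrame(PlainFrame):
    """Hypothetical stand-in for a frame type that carries an _example."""

    def __init__(self, columns, example):
        super().__init__(columns)
        self._example = example


def sjoin(left, right):
    # Same shape as the failing line: read _example from both inputs.
    return (left._example, right._example)


def try_sjoin(left, right):
    # Return the join result, or the error message if an input lacks _example.
    try:
        return sjoin(left, right)
    except AttributeError as exc:
        return str(exc)
```

With two PlainFrame inputs, try_sjoin returns the same message as the traceback; with two GeoFrame inputs it returns both examples.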
Install the Python packages:
sudo pip install dask[dataframe]
sudo pip install geopandas
Try this code:
import geopandas as gpd

# Read both shapefiles as GeoDataFrames.
df1 = gpd.read_file("shapefile1.shp")
df2 = gpd.read_file("shapefile2.shp")

# gpd.sjoin expects GeoDataFrames, so pass them in directly instead of
# wrapping them with dd.from_pandas first.
# Newer geopandas releases call this keyword predicate; older ones use op.
gf2 = gpd.sjoin(df1, df2, how='inner', predicate='intersects')
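A small in-memory sketch of the join itself may help to check the expected result before pointing it at the shapefiles. The zones and points layers are made-up data. dask_sjoin is a hypothetical helper that assumes a recent dask-geopandas release providing from_geopandas and sjoin; the 0.0.1 egg in the traceback may not match it.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Two tiny layers standing in for the shapefiles (hypothetical data).
zones = gpd.GeoDataFrame(
    {"zone": ["a", "b"]},
    geometry=[box(0, 0, 1, 1), box(2, 2, 3, 3)],
)
points = gpd.GeoDataFrame(
    {"name": ["p1", "p2", "p3"]},
    geometry=[Point(0.5, 0.5), Point(2.5, 2.5), Point(9, 9)],
)

# "intersects" is the default spatial test, so the keyword (op in older
# geopandas, predicate in newer releases) is left out here.
joined = gpd.sjoin(points, zones, how="inner")

# Each point paired with the zone it falls in; p3 lies outside both zones.
pairs = sorted(zip(joined["name"], joined["zone"]))


def dask_sjoin(left_gdf, right_gdf, npartitions=2):
    """Run the same inner join through dask-geopandas (assumed installed)."""
    import dask_geopandas as dg

    # from_geopandas yields a Dask GeoDataFrame, unlike dd.from_pandas.
    left = dg.from_geopandas(left_gdf, npartitions=npartitions)
    right = dg.from_geopandas(right_gdf, npartitions=npartitions)
    # compute() collects the partitions back into one GeoDataFrame.
    return dg.sjoin(left, right, how="inner").compute()
```

If dask-geopandas is available, dask_sjoin(points, zones) should pair the same points and zones as the plain geopandas call, although row order may differ.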