AttributeError: type object 'weakref' has no attribute '__callback__'

Asked: 2019-03-25 23:31:40

Tags: pyspark user-defined-functions

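I have a DataFrame with latitude and longitude columns. I use a udf to create a new column that checks whether the corresponding zip code is in a list of zip codes.

I get different results depending on whether search is defined inside or outside in_borough.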

from uszipcode import SearchEngine
from pyspark.sql.functions import udf
from pyspark.sql.types import BooleanType


search = SearchEngine(simple_zipcode=True)


lat = 40.77898
long = -73.96925
zipcodes = [10023]


def in_borough(lat, long):
    # search = SearchEngine(simple_zipcode=True)
    result = search.by_coordinates(lat, long, radius=1, returns=1)

    return (int(result[0].zipcode) in zipcodes) if (len(result) > 0) else False


within_borough = udf(in_borough, BooleanType())

df = spark.createDataFrame([{'lat': lat, 'long': long}])


# calling the function
print(in_borough(lat, long))


# calling the udf
df.select('lat', 'long', within_borough('lat', 'long').alias('within_borough')).show()

Output of the function:

True

Output of the udf with search defined inside the function:

+--------+---------+--------------+
|     lat|     long|within_borough|
+--------+---------+--------------+
|40.77898|-73.96925|          True|
+--------+---------+--------------+

Output of the udf with search defined outside the function:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-165-bbbd186ea9d1> in <module>
    61 
    62 # calling the udf
---> 63 df.select('lat', 'long', within_borough('lat', 'long').alias('within_borough')).show()
[...]
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o3794.showString.
[...]
AttributeError: type object 'weakref' has no attribute '__callback__'
[...]
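
One plausible reading of the difference above, not confirmed anywhere in this thread: when search lives outside in_borough, the udf refers to it as a global, so Spark has to serialize that object to ship the function to its workers, and an object that owns a live database handle generally cannot be serialized. The sketch below illustrates only that general mechanism with the standard library. FakeSearchEngine, can_pickle and the toy exact-match lookup are hypothetical stand-ins, not the uszipcode API, and plain pickle is used in place of whatever serializer Spark applies.

```python
import pickle
import sqlite3


class FakeSearchEngine:
    """Hypothetical stand-in for an object that owns a live database handle."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE zips (zipcode INTEGER, lat REAL, lng REAL)")
        self.conn.execute("INSERT INTO zips VALUES (10023, 40.77898, -73.96925)")

    def by_coordinates(self, lat, lng):
        # Toy lookup: exact coordinate match only, not a real radius search.
        rows = self.conn.execute(
            "SELECT zipcode FROM zips WHERE lat = ? AND lng = ?", (lat, lng)
        ).fetchall()
        return [row[0] for row in rows]


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


zipcodes = [10023]


def in_borough(lat, lng):
    # The engine is built inside the function, so nothing unpicklable
    # has to travel along with the function itself.
    engine = FakeSearchEngine()
    result = engine.by_coordinates(lat, lng)
    return bool(result) and int(result[0]) in zipcodes


# An instance holding an open sqlite3 connection cannot be pickled.
engine_is_picklable = can_pickle(FakeSearchEngine())

# The function that creates its own engine still runs normally.
result_for_sample = in_borough(40.77898, -73.96925)
```

If this reading is right, it matches the observation that the udf succeeds when search is created inside in_borough. The trade-off of that form is that an engine is constructed on every call, which may be slow for large DataFrames.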

0 answers:

No answers yet.