I'm trying to write a function in Azure Databricks. I want to use spark.sql inside the function, but it seems I can't use it on worker nodes.
def SEL_ID(value, index):
    # some processing on value here
    ans = spark.sql("SELECT id FROM table WHERE bin = index")
    return ans

spark.udf.register("SEL_ID", SEL_ID)
I get the following error:
PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Is there any way around this? I'm using the function above to select from another table.