Question

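"PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063."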

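I get the error above when I try to run my code. I hope this example explains what I am trying to do. I have a cons file that holds all the column names, and I am trying to add a label to the result I got from the previous output. The error occurs at udf_get_Status(examresults_table.marks_obtained), and I am not sure about this line. I also tried udf_get_Status(cons.COL_marks_obtained) and got the same error.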

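from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

def get_status(self, marks_obtained):
    # Map a raw marks value to a status label.
    marks_obtained = int(marks_obtained)
    if marks_obtained == 0:
        return "failed"
    if 0 < marks_obtained < 35:
        return "pass"
    else:
        return "good_grade"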

def create_marks_sheet_table(self, student_information, examresults):
    student_information = student_information.select(cons.COL_NAME, cons.COL_ID)
    examresults = examresults.select(cons.COL_ID, cons.COL_marks_obtained)
    # Join the two tables on the shared ID column.
    examresults_table = student_information.join(examresults, on=cons.COL_ID)
    udf_get_Status = udf(self.get_status, StringType())
    # The PicklingError is raised on the next line.
    result_status_label = examresults_table.withColumn(cons.COL_status_label, udf_get_Status(examresults_table.marks_obtained))

PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable
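A likely cause, assuming the class that defines get_status keeps a SparkSession or SparkContext as an attribute (the post does not show the class, so this is a guess): udf(self.get_status, ...) wraps a bound method, so Spark has to pickle self together with the function, and that drags the SparkContext along to the workers. A minimal sketch of one common workaround is to make the labelling logic a plain module-level function that does not touch self. The helper name add_status_label and its parameters are hypothetical and are not taken from the original code.

```python
def get_status(marks_obtained):
    # Plain function: nothing from the enclosing class is captured,
    # so only this function has to be pickled and shipped to the workers.
    marks_obtained = int(marks_obtained)
    if marks_obtained == 0:
        return "failed"
    if 0 < marks_obtained < 35:
        return "pass"
    return "good_grade"


def add_status_label(df, marks_col, label_col):
    # Hypothetical helper: df is a Spark DataFrame, marks_col and label_col
    # are column names (for example cons.COL_marks_obtained and
    # cons.COL_status_label in the post).
    # pyspark is imported here so get_status stays usable without Spark.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    status_udf = udf(get_status, StringType())
    return df.withColumn(label_col, status_udf(col(marks_col)))
```

Inside create_marks_sheet_table this would be called as add_status_label(examresults_table, cons.COL_marks_obtained, cons.COL_status_label). Declaring get_status as a @staticmethod and passing it to udf without going through self should have a similar effect, since the UDF then no longer references the instance.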

0 answers: