Cannot collect RDD after mapPartitions: PicklingError

Time: 2018-06-11 00:43:48

Tags: python market-basket-analysis

I am trying to run this code:

MaptoChunk = basketchunks.mapPartitions(lambda x:Apriori(x,SupportThreshold))
for x in MaptoChunk.collect():
    print x

I have a function named Apriori, which looks like this:

def Apriori(baskets,Support):
    Itemsets=baskets.map(lambda x:x[1])
    ......(To count frequent itemsets)
    return(freqsets)  #this is an rdd

But when I run the code, it gives me this error:

PicklingError: Could not serialize object: Exception: It appears that you 
are attempting to reference SparkContext from a broadcast variable, action, 
or transformation. SparkContext can only be used on the driver, not in code 
that it run on workers. For more information, see SPARK-5063.

Can someone tell me how to keep the mapped values for each partition, and what is wrong with my code?
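A note on the likely cause, offered as a sketch rather than a confirmed diagnosis. The function passed to mapPartitions runs on the workers and receives a plain Python iterator over one partition, not an RDD. It therefore cannot call RDD methods such as .map() on its argument, and it cannot return an RDD. The error text suggests that something reachable from the lambda (possibly in the elided part of Apriori) references the SparkContext or another RDD. The example below assumes each element is a (basket_id, items) pair, as x[1] in the question implies. It uses a hypothetical helper, frequent_itemsets_in_partition, that does a simple brute-force count of itemsets up to max_size instead of full Apriori candidate pruning. It runs without Spark, with a list standing in for one partition.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets_in_partition(baskets, support, max_size=2):
    """Count itemsets in one partition using plain Python only.

    `baskets` stands for the iterator that mapPartitions passes in.
    It is assumed to yield (basket_id, items) pairs. It is not an RDD,
    so no RDD methods are used here. The name and the `max_size`
    parameter are hypothetical, chosen for illustration.
    """
    counts = Counter()
    for _, items in baskets:
        unique_items = sorted(set(items))
        # Brute-force counting of every itemset up to max_size items.
        for size in range(1, max_size + 1):
            counts.update(combinations(unique_items, size))
    # Return an ordinary list of (itemset, count) pairs, never an RDD.
    return [(itemset, n) for itemset, n in counts.items() if n >= support]


# Local check: a list plays the role of a single partition.
sample_partition = [(1, ["a", "b"]), (2, ["a", "c"]), (3, ["a", "b", "c"])]
result = dict(frequent_itemsets_in_partition(iter(sample_partition), support=2))
```

With Spark, the same helper would be passed as basketchunks.mapPartitions(lambda part: frequent_itemsets_in_partition(part, SupportThreshold)), and collect() would then return the per-partition (itemset, count) pairs. This only works if neither the helper nor anything it references touches the SparkContext or an RDD.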

0 answers:

No answers