TensorFlow On Spark: "Can't pickle local object" loop

Time: 2017-08-16 06:54:54

Tags: apache-spark tensorflow parallel-processing pyspark pickle

I have a standalone Spark cluster and am trying to run TensorFlow On Spark on it with Python. So far I have only tried very simple examples, but I always run into the same problem: every worker crashes with the same error message:

AttributeError: Can't pickle local object 'start.<locals>.<lambda>'

A new worker is then assigned, and I eventually end up in an infinite "Waiting for reservations..." loop. There is no explicit pickling in my program, so I assume it must happen somewhere inside the TensorFlow On Spark pipeline. Without the Spark wrapper, my TensorFlow application runs fine. I have observed this behavior on both Windows 7 and CentOS Linux 7.
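For context on the error itself (not a TensorFlow On Spark fix): on Windows, multiprocessing uses the "spawn" start method, which has to pickle the manager's state into the child process, and Python's pickle cannot serialize functions defined inside another function's scope. A minimal standalone sketch of that limitation, independent of Spark:

```python
import pickle


def top_level(x):
    # Module-level functions are pickled by qualified name, so a spawned
    # child process can re-import the module and look them up again.
    return x * 2


def start():
    # A lambda defined inside a function exists only in this local scope.
    # pickle records it as 'start.<locals>.<lambda>', a name that cannot
    # be resolved on the receiving side, so dumping it fails.
    local_fn = lambda x: x * 2
    try:
        pickle.dumps(local_fn)
        return "pickled OK"
    except (AttributeError, pickle.PicklingError) as e:
        return str(e)


if __name__ == "__main__":
    print(start())  # e.g. "Can't pickle local object 'start.<locals>.<lambda>'"
    print(pickle.loads(pickle.dumps(top_level))(21))  # round-trips fine
```

The traceback below bottoms out in exactly this situation: TFManager.start passes a locally defined lambda to a multiprocessing manager, which the spawn-based ForkingPickler on Windows then fails to serialize.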

The (almost) complete error output is below:

17/08/15 16:40:36 ERROR Executor: Exception in task 0.2 in stage 0.0 (TID 5)
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "D:\Spark\python\lib\pyspark.zip\pyspark\worker.py", line 177, in main
  File "D:\Spark\python\lib\pyspark.zip\pyspark\worker.py", line 172, in process
  File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 2423, in pipeline_func
    return func(split, prev_func(split, iterator))
  File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 2423, in pipeline_func
    return func(split, prev_func(split, iterator))
  File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 2423, in pipeline_func
    return func(split, prev_func(split, iterator))
  File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 346, in func
    return f(iterator)
  File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 794, in func
    r = f(it)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflowonspark\TFSparkNode.py", line 290, in _mapfn
    TFSparkNode.mgr = TFManager.start(authkey, ['control'], 'remote')
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflowonspark\TFManager.py", line 41, in start
    mgr.start()
  File "C:\Program Files\Anaconda3\lib\multiprocessing\managers.py", line 513, in start
    self._process.start()
  File "C:\Program Files\Anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Program Files\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Program Files\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'start.<locals>.<lambda>'
    at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
    at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
    at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)

It seems this issue is known, but there is no solution yet. Any hints would be appreciated, as I am out of ideas.

0 Answers

No answers yet.