TensorFlow on Spark - PySpark RDD transformation returns RuntimeError: maximum recursion depth exceeded

Date: 2017-06-04 08:48:32

Tags: python apache-spark tensorflow pyspark

I am trying to run TensorFlow over a collection of images with PySpark, but when I apply the transformation to the RDD I get the error below. The failure happens inside the function run_inference_on_image, and I get it both with "local" and with "spark://master.spark.tfm:7077" as the master.
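
The transformation that triggers the error boils down to this pattern (condensed from the full script shown further down; every name comes from that script):

model_data_bc = sc.broadcast(model_data)      # serialized Inception graph
node_lookup_bc = sc.broadcast(node_lookup)    # node ID -> label dictionary
rddImagenes = sc.parallelize(lote_imagenes).map(lambda x: map(obtener_nombre_imagen, x))
imagenes_etiquetadas = rddImagenes.flatMap(lambda x: apply_inference_on_batch(x, node_lookup_bc))
l = imagenes_etiquetadas.collect()            # the RuntimeError is raised when this job runs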

This is the error trace:

/usr/bin/python2.7 /home/utad/PycharmProjects/TensorFlowMirFlickr/ClasificacionImagenes.py
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/06/04 00:47:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
('Model already downloaded:', '/home/utad/TFM/model/inception-2015-12-05.tgz', posix.stat_result(st_mode=33204, st_ino=709599, st_dev=2049, st_nlink=1, st_uid=1000, st_gid=1000, st_size=88931400, st_atime=1496482158, st_mtime=1496482158, st_ctime=1496482158))
('rddImagenes: ', [['http://host.images.tfm:8000/mirflickr/im1.jpg'], ['http://host.images.tfm:8000/mirflickr/im10.jpg'], ['http://host.images.tfm:8000/mirflickr/im100.jpg']])
17/06/04 00:47:24 ERROR executor.Executor: Exception in task 0.0 in stage 1.0 (TID 1)
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/opt/spark/python/lib/pyspark.zip/pyspark/worker.py", line 163, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/opt/spark/python/lib/pyspark.zip/pyspark/worker.py", line 54, in read_command
    command = serializer._read_with_length(file)
  File "/opt/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 169, in _read_with_length
    return self.loads(obj)
  File "/opt/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 454, in loads
    return pickle.loads(obj)
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 118, in __getattr__
    _module = self._resolve()
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 115, in _resolve
    return _import_module(self.mod)
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 118, in __getattr__
    _module = self._resolve()
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 115, in _resolve
    return _import_module(self.mod)
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 118, in __getattr__
    _module = self._resolve()
...
...
...
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 115, in _resolve
    return _import_module(self.mod)
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 118, in __getattr__
    _module = self._resolve()
  File "/home/utad/.local/lib/python2.7/site-packages/six.py", line 115, in _resolve
    return _import_module(self.mod)
RuntimeError: maximum recursion depth exceeded

    at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
    at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
    at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

Launching it with spark-submit gives me this driver stack trace:

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1925)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1938)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1951)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1965)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:936)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:935)
    at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:453)
    at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:280)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)

My code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Modules to use
from pyspark import SparkContext
import os.path
from six.moves import urllib
import tarfile
import tensorflow as tf
import re
import numpy as np

# **************************************************************************************
# Process configuration
# We download the model from the network and store it in a local directory
MODEL_URL = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
model_dir = '/home/utad/TFM/model'

# We simulate that the images come from a web service
IMAGES_INDEX_URL = 'http://host.images.tfm:8000/mirflickr/'

# Other settings
numero_imagenes_proceso = 3  # Total number of images to process
lote_size = 1                # Number of images per batch
max_etiquetas = 5            # Maximum number of labels per image
# End of process configuration
# **************************************************************************************


# Fetch the model
def get_tensorflow_model():
    # Download and extract model tar file
    filename = MODEL_URL.split('/')[-1]
    filepath = os.path.join(model_dir, filename)
    if not os.path.exists(filepath):
        filepath2, _ = urllib.request.urlretrieve(MODEL_URL, filepath)
        print("filepath2", filepath2)
        statinfo = os.stat(filepath)
        print('Succesfully downloaded', filename, statinfo.st_size, 'bytes.')
        tarfile.open(filepath, 'r:gz').extractall(model_dir)
    else:
        print('Model already downloaded:', filepath, os.stat(filepath))


# Taken from classify_image.py in the TensorFlow GitHub repository
class NodeLookup(object):
    """Converts integer node IDs to human readable labels."""

    def __init__(self, label_lookup_path=None, uid_lookup_path=None):
        if not label_lookup_path:
            label_lookup_path = os.path.join(
                model_dir, 'imagenet_2012_challenge_label_map_proto.pbtxt')
        if not uid_lookup_path:
            uid_lookup_path = os.path.join(
                model_dir, 'imagenet_synset_to_human_label_map.txt')
        self.node_lookup = self.load(label_lookup_path, uid_lookup_path)

    def load(self, label_lookup_path, uid_lookup_path):
        """Loads a human readable English name for each softmax node.

        Args:
          label_lookup_path: string UID to integer node ID.
          uid_lookup_path: string UID to human-readable string.

        Returns:
          dict from integer node ID to human-readable string.
        """
        if not tf.gfile.Exists(uid_lookup_path):
            tf.logging.fatal('File does not exist %s', uid_lookup_path)
        if not tf.gfile.Exists(label_lookup_path):
            tf.logging.fatal('File does not exist %s', label_lookup_path)

        # Loads mapping from string UID to human-readable string
        proto_as_ascii_lines = tf.gfile.GFile(uid_lookup_path).readlines()
        uid_to_human = {}
        p = re.compile(r'[n\d]*[ \S,]*')
        for line in proto_as_ascii_lines:
            parsed_items = p.findall(line)
            uid = parsed_items[0]
            human_string = parsed_items[2]
            uid_to_human[uid] = human_string

        # Loads mapping from string UID to integer node ID.
        node_id_to_uid = {}
        proto_as_ascii = tf.gfile.GFile(label_lookup_path).readlines()
        for line in proto_as_ascii:
            if line.startswith('  target_class:'):
                target_class = int(line.split(': ')[1])
            if line.startswith('  target_class_string:'):
                target_class_string = line.split(': ')[1]
                node_id_to_uid[target_class] = target_class_string[1:-2]

        # Loads the final mapping of integer node ID to human-readable string
        node_id_to_name = {}
        for key, val in node_id_to_uid.items():
            if val not in uid_to_human:
                tf.logging.fatal('Failed to locate: %s', val)
            name = uid_to_human[val]
            node_id_to_name[key] = name

        return node_id_to_name

    def id_to_string(self, node_id):
        if node_id not in self.node_lookup:
            return ''
        return self.node_lookup[node_id]


def run_inference_on_image(sess, image, lookup):
    """Run inference on a single image.

    Args:
      sess: TensorFlow Session
      image: image to read
      lookup: node lookup obtained previously

    Returns:
      (image ID, image URL, scores), where scores is a list of
      (human-readable node name, score) pairs
    """
    image_data = urllib.request.urlopen(image).read()
    print("Image: ", image_data)

    # Some useful tensors:
    # 'softmax:0': A tensor containing the normalized prediction across
    #   1000 labels.
    # 'pool_3:0': A tensor containing the next-to-last layer containing 2048
    #   float description of the image.
    # 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
    #   encoding of the image.
    # Runs the softmax tensor by feeding the image_data as input to the graph.
    softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
    try:
        predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})
        print("predictions: ", predictions)
    except:
        # Handle problems with malformed JPEG files
        return image, None

    predictions = np.squeeze(predictions)
    top_k = predictions.argsort()[-max_etiquetas:][::-1]
    print("top_k predictions: ", top_k)
    scores = []
    for node_id in top_k:
        if node_id not in lookup:
            human_string = ''
        else:
            human_string = lookup[node_id]
        score = predictions[node_id]
        scores.append((human_string, score))
    print("tupla: ", image, scores)
    return image, scores


def apply_inference_on_batch(lote, lookup_bc):
    """Apply inference to a batch of images.

    We do not explicitly tell TensorFlow to use a GPU. It is able to choose
    between CPU and GPU based on its guess of which will be faster.
    """
    with tf.Graph().as_default() as g:
        print("Apply")
        graph_def = tf.GraphDef()
        print("Graph_def: ", graph_def)
        graph_def.ParseFromString(model_data_bc.value)
        print("Graph_def1: ", graph_def)
        _ = tf.import_graph_def(graph_def, name='')
        print("TF: ", tf)
        with tf.Session() as sess:
            print("Sesion: ", sess)
            print("Lote: ", lote)
            print("Node_lookup: ", lookup_bc)
            labeled = [run_inference_on_image(sess, image, lookup_bc.value) for image in lote]
            return [tup for tup in labeled if tup[1] is not None]


# Helper function to build the URL of an image.
def obtener_nombre_imagen(x):
    return IMAGES_INDEX_URL + x.split('<')[1].split('>')[1]


# Start the SparkContext
# sc = SparkContext('spark://master.spark.tfm:7077', 'TensorFlow')
sc = SparkContext('local')

get_tensorflow_model()

# Load the model and distribute it to the workers
model_path = os.path.join(model_dir, 'classify_image_graph_def.pb')
with tf.gfile.FastGFile(model_path, 'rb') as f:
    model_data = f.read()
model_data_bc = sc.broadcast(model_data)

# Distribute the node lookup so it can be used on the workers
node_lookup = NodeLookup().node_lookup
node_lookup_bc = sc.broadcast(node_lookup)

# Get the list of images to process and group them into batches
imagenes = urllib.request.urlopen(IMAGES_INDEX_URL).read().split('<li>')[2:numero_imagenes_proceso+2]
lote_imagenes = [imagenes[i:i + lote_size] for i in range(0, len(imagenes), lote_size)]

# Parallelize the image batches and process them
rddImagenes = sc.parallelize(lote_imagenes).map(lambda x: map(obtener_nombre_imagen, x))
print("rddImagenes: ", rddImagenes.collect())
imagenes_etiquetadas = rddImagenes.flatMap(lambda x: apply_inference_on_batch(x, node_lookup_bc))
# imagenes_etiquetadas = rddImagenes.flatMap(lambda x: x[0].split("/"))
l = imagenes_etiquetadas.collect()

Any suggestions about what is going on?

Edit: I just found that the problem is in the call to […].

1 Answer:

Answer 0 (score: 0)

I figured it out, or at least I found a workaround.

I changed:

from six.moves import urllib

to:

import urllib
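
For this to work with the plain Python 2 urllib module, the request. prefix has to go away at the call sites as well. A sketch of how the two calls in the script above would then look, assuming Python 2.7 as in the traceback:

import urllib  # Python 2 standard-library module

# model download in get_tensorflow_model (was urllib.request.urlretrieve)
filepath2, _ = urllib.urlretrieve(MODEL_URL, filepath)

# image fetch in run_inference_on_image (was urllib.request.urlopen)
image_data = urllib.urlopen(image).read()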

I do not know why six.moves does not work properly here, but this solved my problem.
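
Another workaround that should avoid the recursion while keeping six (untested here) is to move the import inside the function that runs on the executors, so the lazily resolved six.moves module object never gets pickled with the closure. A sketch, where fetch_image_bytes is a hypothetical helper and not part of the script above:

def fetch_image_bytes(image_url):
    # The import happens on the worker at call time, so nothing six-related
    # is captured in the serialized closure.
    from six.moves import urllib
    return urllib.request.urlopen(image_url).read()

# e.g. rddImagenes.flatMap(lambda lote: [fetch_image_bytes(url) for url in lote])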