Submitting a job through Spark 2.0.0 — why does it run Spark 1.5.2?

Date: 2016-08-03 06:54:25

Tags: apache-spark pyspark mesos

I am trying to upgrade from Spark 1.5.2 to Spark 2.0.0, testing on two machines (node3, node7). I submit the job through the Spark 2.0.0 spark-submit, but the job runs on Spark 1.5.2.

I get an error when submitting the job on node3:

~/software/spark-2.0.0-bin-hadoop2.6/bin$ spark-submit --master mesos://192.168.1.203:5050  ../examples/src/main/python/pimy.py

Mesos executor stderr log on node7:

sh: 1: /home/jianxun/software/spark-1.5.2-bin-hadoop2.6/bin/spark-class: not found
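
This path belongs to the old 1.5.2 installation, even though node7 only has Spark 2.0.0 (see below), and the Python traceback later in this question also references the spark-1.5.2 tree. So it is worth checking what the shell that runs spark-submit actually exports — a diagnostic sketch, using the paths quoted in this question:

# Run in the same shell on node3 that runs spark-submit:
echo "$SPARK_HOME"     # still pointing at spark-1.5.2-bin-hadoop2.6?
echo "$PYTHONPATH"     # any pyspark.zip from the 1.5.2 tree?
which spark-submit     # which binary does PATH resolve first?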

JDK on node3:

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~15.10.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

/etc/profile on node3:

export M2_HOME=/usr/share/maven
export M2=$M2_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
export PATH=/home/jianxun/software/mongodb-linux-x86_64-3.2.0/bin:$PATH
export HIVE_HOME=/home/jianxun/software/apache-hive-2.0.1-bin
export PATH=$HIVE_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:/usr/share/java/mysql.jar
export SPARK_HOME=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6

JDK on node7:

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~15.10.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

/etc/profile on node7:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
export SPARK_HOME=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6
export PYTHONPATH=/usr/lib/python2.7

The Mesos version is 0.25. The Mesos master is node3, and the only Mesos slave is node7. node3 has two versions of Spark:

  1. ~/software/spark-2.0.0-bin-hadoop2.6/
  2. ~/software/spark-1.5.2-bin-hadoop2.6/
  3. Spark configuration on node3:

    spark-env.sh

    export MESOS_NATIVE_JAVA_LIBRARY=/home/jianxun/software/mesos/lib/libmesos-0.25.0.so
    export SCALA_HOME=/usr/share/scala-2.11
    export SPARK_EXCUTOR_URI=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6.tgz
    

    spark-defaults.conf

    spark.local.dir                    /data/sparktmp
    spark.shuffle.service.enabled      true
    spark.mesos.coarse                 true
    spark.executor.memory              24g
    spark.executor.cores               7
    spark.cores.max                    7
    spark.executor.uri                 /home/jianxun/software/spark-2.0.0-bin-hadoop2.6.tgz
    

    node7 has only the new version of Spark:

    1. ~/software/spark-2.0.0-bin-hadoop2.6/
    2. ~/software/spark-2.0.0-bin-hadoop2.6.tgz (the binary archive)
    3. spark-submit log (the important parts are marked with ****):

      *********************************************************
      *********************************************************
      Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      SLF4J: Class path contains multiple SLF4J bindings.
      SLF4J: Found binding in [jar:file:/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/lib/spark-examples-1.5.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: Found binding in [jar:file:/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
      SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
      16/08/03 12:31:33 INFO SparkContext: Running Spark version 1.5.2
      *****************************************************************
      *****************************************************************
      16/08/03 12:31:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      16/08/03 12:31:34 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
      16/08/03 12:31:34 INFO SecurityManager: Changing view acls to: jianxun
      16/08/03 12:31:34 INFO SecurityManager: Changing modify acls to: jianxun
      16/08/03 12:31:34 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jianxun); users with modify permissions: Set(jianxun)
      16/08/03 12:31:34 INFO Slf4jLogger: Slf4jLogger started
      16/08/03 12:31:34 INFO Remoting: Starting remoting
      16/08/03 12:31:34 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.203:40978]
      16/08/03 12:31:34 INFO Utils: Successfully started service 'sparkDriver' on port 40978.
      16/08/03 12:31:34 INFO SparkEnv: Registering MapOutputTracker
      16/08/03 12:31:34 INFO SparkEnv: Registering BlockManagerMaster
      16/08/03 12:31:34 INFO DiskBlockManager: Created local directory at /data/sparktmp/blockmgr-76944d0c-de18-4f52-9249-8c3ca6141f59
      16/08/03 12:31:34 INFO MemoryStore: MemoryStore started with capacity 12.4 GB
      16/08/03 12:31:34 INFO HttpFileServer: HTTP File server directory is /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/httpd-a64948d7-9e78-42f0-b711-84fc5f040517
      16/08/03 12:31:34 INFO HttpServer: Starting HTTP Server
      16/08/03 12:31:35 INFO Utils: Successfully started service 'HTTP file server' on port 35616.
      16/08/03 12:31:35 INFO SparkEnv: Registering OutputCommitCoordinator
      16/08/03 12:31:35 INFO Utils: Successfully started service 'SparkUI' on port 4040.
      16/08/03 12:31:35 INFO SparkUI: Started SparkUI at http://192.168.1.203:4040
      16/08/03 12:31:35 INFO Utils: Copying /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py to /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/userFiles-03a46142-7a44-43d0-82de-10c174721a99/pimy.py
      16/08/03 12:31:35 INFO SparkContext: Added file file:/home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py at http://192.168.1.203:35616/files/pimy.py with timestamp 1470198695252
      16/08/03 12:31:35 WARN SparkContext: Using SPARK_MEM to set amount of memory to use per executor process is deprecated, please use spark.executor.memory instead.
      16/08/03 12:31:35 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
      I0803 12:31:35.419636 32575 sched.cpp:164] Version: 0.25.0
      I0803 12:31:35.430359 32570 sched.cpp:262] New master detected at master@192.168.1.203:5050
      I0803 12:31:35.431447 32570 sched.cpp:272] No credentials provided. Attempting to register without authentication
      I0803 12:31:35.434844 32570 sched.cpp:641] Framework registered with ff2cf87e-3712-413f-a452-6d71430527bc-0012
      16/08/03 12:31:35 INFO MesosSchedulerBackend: Registered as framework ID ff2cf87e-3712-413f-a452-6d71430527bc-0012
      16/08/03 12:31:35 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41218.
      16/08/03 12:31:35 INFO NettyBlockTransferService: Server created on 41218
      16/08/03 12:31:35 INFO BlockManagerMaster: Trying to register BlockManager
      16/08/03 12:31:35 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.203:41218 with 12.4 GB RAM, BlockManagerId(driver, 192.168.1.203, 41218)
      16/08/03 12:31:35 INFO BlockManagerMaster: Registered BlockManager
      16/08/03 12:31:36 INFO SparkContext: Starting job: reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38
      16/08/03 12:31:36 INFO DAGScheduler: Got job 0 (reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38) with 2 output partitions
      16/08/03 12:31:36 INFO DAGScheduler: Final stage: ResultStage 0(reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38)
      16/08/03 12:31:36 INFO DAGScheduler: Parents of final stage: List()
      16/08/03 12:31:36 INFO DAGScheduler: Missing parents: List()
      16/08/03 12:31:36 INFO DAGScheduler: Submitting ResultStage 0 (PythonRDD[1] at reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38), which has no missing parents
      16/08/03 12:31:36 INFO MemoryStore: ensureFreeSpace(4272) called with curMem=0, maxMem=13335873454
      16/08/03 12:31:36 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.2 KB, free 12.4 GB)
      16/08/03 12:31:36 INFO MemoryStore: ensureFreeSpace(2792) called with curMem=4272, maxMem=13335873454
      ....
      ....
      16/08/03 12:31:37 INFO DAGScheduler: Job 0 failed: reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38, took 1.002633 s
      Traceback (most recent call last):
        File "/home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py", line 38, in <module>
          count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
        File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 799, in reduce
        File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 773, in collect
        File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
        File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
      py4j.protocol.Py4JJavaError16/08/03 12:31:37 INFO DAGScheduler: Executor lost: ff2cf87e-3712-413f-a452-6d71430527bc-S4 (epoch 3)
      : An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
      : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 7, node7): ExecutorLostFailure (executor ff2cf87e-3712-413f-a452-6d71430527bc-S4lost)
      Driver stacktrace:
              at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
              at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
              at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
              at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
              at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
              at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
              at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
              at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
              at scala.Option.foreach(Option.scala:236)
              at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
              at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
              at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
              at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
              at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
              at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
              at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
              at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
              at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
              at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
              at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
              at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
              at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
              at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
              at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
              at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405)
              at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
              at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
              at py4j.Gateway.invoke(Gateway.java:259)
              at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
              at py4j.commands.CallCommand.execute(CallCommand.java:79)
              at py4j.GatewayConnection.run(GatewayConnection.java:207)
              at java.lang.Thread.run(Thread.java:745)
      16/08/03 12:31:37 INFO BlockManagerMasterEndpoint: Trying to remove executor ff2cf87e-3712-413f-a452-6d71430527bc-S4 from BlockManagerMaster.
      16/08/03 12:31:37 INFO BlockManagerMaster: Removed ff2cf87e-3712-413f-a452-6d71430527bc-S4 successfully in removeExecutor
      16/08/03 12:31:37 INFO DAGScheduler: Host added was in lost list earlier: node7
      16/08/03 12:31:37 INFO SparkContext: Invoking stop() from shutdown hook
      16/08/03 12:31:37 INFO SparkUI: Stopped Spark web UI at http://192.168.1.203:4040
      16/08/03 12:31:37 INFO DAGScheduler: Stopping DAGScheduler
      I0803 12:31:37.146209 32592 sched.cpp:1771] Asked to stop the driver
      I0803 12:31:37.146414 32573 sched.cpp:1040] Stopping framework 'ff2cf87e-3712-413f-a452-6d71430527bc-0012'
      16/08/03 12:31:37 INFO MesosSchedulerBackend: driver.run() returned with code DRIVER_STOPPED
      16/08/03 12:31:37 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
      16/08/03 12:31:37 INFO MemoryStore: MemoryStore cleared
      16/08/03 12:31:37 INFO BlockManager: BlockManager stopped
      16/08/03 12:31:37 INFO BlockManagerMaster: BlockManagerMaster stopped
      16/08/03 12:31:37 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
      16/08/03 12:31:37 INFO SparkContext: Successfully stopped SparkContext
      16/08/03 12:31:37 INFO ShutdownHookManager: Shutdown hook called
      16/08/03 12:31:37 INFO ShutdownHookManager: Deleting directory /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/pyspark-02048aa7-deaf-4af5-adde-86732cd44324
      16/08/03 12:31:37 INFO ShutdownHookManager: Deleting directory /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24
      

      Mesos warning log on node7:

      Log file created at: 2016/08/03 12:31:36
      Running on machine: node7
      Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
      W0803 12:31:36.408701  5686 containerizer.cpp:988] Ignoring update for unknown container: 9910a15a-ec96-4e5a-91b9-58652b2bcaa5
      W0803 12:31:36.409050  5686 containerizer.cpp:988] Ignoring update for unknown container: 9910a15a-ec96-4e5a-91b9-58652b2bcaa5
      W0803 12:31:36.613108  5687 containerizer.cpp:988] Ignoring update for unknown container: 108436bb-429b-4214-9d9b-9fa452383093
      W0803 12:31:36.613817  5691 containerizer.cpp:988] Ignoring update for unknown container: 108436bb-429b-4214-9d9b-9fa452383093
      W0803 12:31:36.807909  5692 containerizer.cpp:988] Ignoring update for unknown container: 5c9abbdb-ee6a-4175-8087-d6d1dd1bd5ea
      W0803 12:31:36.808281  5692 containerizer.cpp:988] Ignoring update for unknown container: 5c9abbdb-ee6a-4175-8087-d6d1dd1bd5ea
      W0803 12:31:37.019579  5687 containerizer.cpp:988] Ignoring update for unknown container: 7a11174e-7774-453c-bdf7-5cbb5b4afcfa
      W0803 12:31:37.020051  5693 containerizer.cpp:988] Ignoring update for unknown container: 7a11174e-7774-453c-bdf7-5cbb5b4afcfa
      W0803 12:31:37.142438  5690 slave.cpp:1995] Cannot shut down unknown framework ff2cf87e-3712-413f-a452-6d71430527bc-0012
      

1 Answer:

Answer 0 (score: 0):

Source the /etc/profile file.
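
Changes to /etc/profile only take effect in shells started (or re-sourced) after the edit, so a session opened before the upgrade still carries the old SPARK_HOME and Python paths pointing at 1.5.2. A minimal sketch, assuming the paths from the question:

# In every shell that submits jobs (and before restarting the Mesos slave on node7):
source /etc/profile

# Verify the environment now points at Spark 2.0.0:
echo "$SPARK_HOME"                        # expect .../spark-2.0.0-bin-hadoop2.6
"$SPARK_HOME/bin/spark-submit" --version  # should report 2.0.0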