Spark job server starts but shows an error (in akka.util.Helpers)

Date: 2017-03-21 07:02:42

Tags: apache-spark spark-jobserver

I have deployed spark job server (version 0.6.2) on a remote machine running a YARN cluster with Cloudera (version 5.8.2), following the instructions given here. After the deployment, when I try to start the server, I get the following error:

  

Exception in thread "main" java.lang.NoSuchMethodError: akka.util.Helpers$.ConfigOps(Lcom/typesafe/config/Config;)Lcom/typesafe/config/Config;
    at akka.cluster.ClusterSettings.<init>(ClusterSettings.scala:28)
    at akka.cluster.Cluster.<init>(Cluster.scala:67)
    at akka.cluster.Cluster$.createExtension(Cluster.scala:42)
    at akka.cluster.Cluster$.createExtension(Cluster.scala:37)
    at akka.actor.ActorSystemImpl.registerExtension(ActorSystem.scala:654)
    at akka.actor.ExtensionId$class.apply(Extension.scala:79)
    at akka.cluster.Cluster$.apply(Cluster.scala:37)
    at akka.cluster.ClusterActorRefProvider.createRemoteWatcher(ClusterActorRefProvider.scala:66)
    at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:186)
    at akka.cluster.ClusterActorRefProvider.init(ClusterActorRefProvider.scala:58)
    at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
    at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
    at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
    at spark.jobserver.JobServer$.spark$jobserver$JobServer$$makeSupervisorSystem$1(JobServer.scala:128)
    at spark.jobserver.JobServer$$anonfun$main$1.apply(JobServer.scala:130)
    at spark.jobserver.JobServer$$anonfun$main$1.apply(JobServer.scala:130)
    at spark.jobserver.JobServer$.start(JobServer.scala:54)
    at spark.jobserver.JobServer$.main(JobServer.scala:130)
    at spark.jobserver.JobServer.main(JobServer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
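For context, the deployment and start-up follow the standard spark-jobserver flow described in the linked instructions. A rough sketch of the start attempt that produces the error (the directories are taken from local.sh further below; everything else is an assumption):

# On the remote machine, after the deploy step has copied the package over:
cd /home/ubuntu/spark/deployed-job-server        # INSTALL_DIR from local.sh
./start_server.sh                                # forks spark-submit in the background
tail -n 100 /var/log/deployed-job-server/*.log   # LOG_DIR from local.sh; the stack trace typically lands here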

Files on the server when running start_server.sh:

local.conf

# Template for a Spark Job Server configuration file
# When deployed these settings are loaded when job server starts
#
# Spark Cluster / Job Server configuration
spark {
  # spark.master will be passed to each job's JobContext
  # master = "local[4]"
  # master = "mesos://vm28-hulk-pub:5050"
  master = "yarn-client"

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 4

  jobserver {
    port = 8090
    jar-store-rootdir = /tmp/jobserver/jars

    context-per-jvm = true

    jobdao = spark.jobserver.io.JobFileDAO

    filedao {
      rootdir = /tmp/spark-job-server/filedao/data
    }

    # When using chunked transfer encoding with scala Stream job results, this is the size of each chunk
    result-chunk-size = 1m
  }

  # predefined Spark contexts
  # contexts {
  #   my-low-latency-context {
  #     num-cpu-cores = 1           # Number of cores to allocate.  Required.
  #     memory-per-node = 512m         # Executor memory per node, -Xmx style eg 512m, 1G, etc.
  #   }
  #   # define additional contexts here
  # }

  # universal context configuration.  These settings can be overridden, see README.md
  context-settings {
    num-cpu-cores = 2           # Number of cores to allocate.  Required.
    memory-per-node = 512m         # Executor memory per node, -Xmx style eg 512m, #1G, etc.

    # in case spark distribution should be accessed from HDFS (as opposed to being installed on every mesos slave)
    # spark.executor.uri = "hdfs://namenode:8020/apps/spark/spark.tgz"

    # uris of jars to be loaded into the classpath for this context. Uris is a string list, or a string separated by commas ','
    # dependent-jar-uris = ["file:///some/path/present/in/each/mesos/slave/somepackage.jar"]

    # If you wish to pass any settings directly to the sparkConf as-is, add them here in passthrough,
    # such as hadoop connection settings that don't use the "spark." prefix
    passthrough {
      #es.nodes = "192.1.1.1"
    }
  }

  # This needs to match SPARK_HOME for cluster SparkContexts to be created successfully
  home = "/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/spark"
}

# Note that you can use this file to define settings not only for job server,
# but for your Spark jobs as well.  Spark job configuration merges with this configuration file as defaults.

akka {
  remote.netty.tcp {
    # This controls the maximum message size, including job results, that can be sent
    # maximum-frame-size = 10 MiB
  }
}
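The job server parses this file as Typesafe Config (HOCON), so individual settings can also be overridden at start-up with Java system properties, which start_server.sh below forwards to the driver via $CONFIG_OVERRIDES. A hypothetical example (the port value is illustrative only, and it assumes setenv.sh does not already set CONFIG_OVERRIDES itself):

# Override the HTTP port for a single run without editing local.conf:
CONFIG_OVERRIDES="-Dspark.jobserver.port=9090" ./start_server.sh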

start_server.sh

#!/bin/bash
# Script to start the job server
# Extra arguments will be spark-submit options, for example
#  ./server_start.sh --jars cassandra-spark-connector.jar
#
# Environment vars (note settings.sh overrides):
#   JOBSERVER_MEMORY - defaults to 1G, the amount of memory (eg 512m, 2G) to give to job server
#   JOBSERVER_CONFIG - alternate configuration file to use
#   JOBSERVER_FG    - launches job server in foreground; defaults to forking in background
echo 'Starting job server...'
set -e

get_abs_script_path() {
  pushd . >/dev/null
  cd "$(dirname "$0")"
  appdir=$(pwd)
  popd  >/dev/null
}

get_abs_script_path

. $appdir/setenv.sh

GC_OPTS="-XX:+UseConcMarkSweepGC
         -verbose:gc -XX:+PrintGCTimeStamps -Xloggc:$appdir/gc.out
         -XX:MaxPermSize=512m
         -XX:+CMSClassUnloadingEnabled "

# To truly enable JMX in AWS and other containerized environments, also need to set
# -Djava.rmi.server.hostname equal to the hostname in that environment.  This is specific
# depending on AWS vs GCE etc.
JAVA_OPTS="-XX:MaxDirectMemorySize=$MAX_DIRECT_MEMORY \
           -XX:+HeapDumpOnOutOfMemoryError -Djava.net.preferIPv4Stack=true"
#           -Dcom.sun.management.jmxremote.port=9999 \
#           -Dcom.sun.management.jmxremote.rmi.port=9999 \
#           -Dcom.sun.management.jmxremote.authenticate=false \
#           -Dcom.sun.management.jmxremote.ssl=false"

MAIN="spark.jobserver.JobServer"

PIDFILE=$appdir/spark-jobserver.pid
if [ -f "$PIDFILE" ] && kill -0 $(cat "$PIDFILE"); then
   echo 'Job server is already running'
   exit 1
fi

# Code added
echo App dir: $appdir
echo Conf file_path: $conffile
echo Spark home: $SPARK_HOME
echo Main class path: $MAIN


cmd='$SPARK_HOME/bin/spark-submit --class $MAIN --driver-memory $JOBSERVER_MEMORY
  --conf "spark.executor.extraJavaOptions=$LOGGING_OPTS"
  --driver-java-options "$GC_OPTS $JAVA_OPTS $LOGGING_OPTS $CONFIG_OVERRIDES"
  $@ $appdir/spark-job-server.jar $conffile'

# Code added
if [ -z "$JOBSERVER_FG" ]; then
  eval $cmd &
  echo $! > $PIDFILE 
else
  eval $cmd
fi
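As the header comments note, the script forks spark-submit into the background unless JOBSERVER_FG is set. When debugging a start-up failure like the one above, running it in the foreground keeps the full output on the terminal:

# Any non-empty value keeps the job server in the foreground:
JOBSERVER_FG=1 ./start_server.sh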

Files on the local machine before deployment:

local.conf

# Template for a Spark Job Server configuration file
# When deployed these settings are loaded when job server starts
#
# Spark Cluster / Job Server configuration
spark {
  # spark.master will be passed to each job's JobContext
  # master = "local[4]"
  # master = "mesos://vm28-hulk-pub:5050"
  master = "yarn-client"

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 4

  jobserver {
    port = 8090
    jar-store-rootdir = /tmp/jobserver/jars

    context-per-jvm = false

    jobdao = spark.jobserver.io.JobFileDAO

    filedao {
      rootdir = /tmp/spark-job-server/filedao/data
    }

    # When using chunked transfer encoding with scala Stream job results, this is the size of each chunk
    result-chunk-size = 1m
  }

  # predefined Spark contexts
  # contexts {
  #   my-low-latency-context {
  #     num-cpu-cores = 1           # Number of cores to allocate.  Required.
  #     memory-per-node = 512m         # Executor memory per node, -Xmx style eg 512m, 1G, etc.
  #   }
  #   # define additional contexts here
  # }

  # universal context configuration.  These settings can be overridden, see README.md
  context-settings {
    num-cpu-cores = 2           # Number of cores to allocate.  Required.
    memory-per-node = 512m         # Executor memory per node, -Xmx style eg 512m, #1G, etc.

    # in case spark distribution should be accessed from HDFS (as opposed to being installed on every mesos slave)
    # spark.executor.uri = "hdfs://namenode:8020/apps/spark/spark.tgz"

    # uris of jars to be loaded into the classpath for this context. Uris is a string list, or a string separated by commas ','
    # dependent-jar-uris = ["file:///some/path/present/in/each/mesos/slave/somepackage.jar"]

    # If you wish to pass any settings directly to the sparkConf as-is, add them here in passthrough,
    # such as hadoop connection settings that don't use the "spark." prefix
    passthrough {
      #es.nodes = "192.1.1.1"
    }
  }

  # This needs to match SPARK_HOME for cluster SparkContexts to be created successfully
  home = "/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/spark"
}

# Note that you can use this file to define settings not only for job server,
# but for your Spark jobs as well.  Spark job configuration merges with this configuration file as defaults.

akka {
  remote.netty.tcp {
    # This controls the maximum message size, including job results, that can be sent
    # maximum-frame-size = 10 MiB
  }
}

local.sh

# Environment and deploy file
# For use with bin/server_deploy, bin/server_package etc.
DEPLOY_HOSTS="xx.xx.xxx.xxx"

APP_USER=ubuntu
APP_GROUP=ubuntu
# optional SSH Key to login to deploy server
SSH_KEY=/home/xx/xx/xx.pem
INSTALL_DIR=/home/ubuntu/spark/deployed-job-server
LOG_DIR=/var/log/deployed-job-server
PIDFILE=spark-jobserver.pid
JOBSERVER_MEMORY=1G
SPARK_VERSION=1.6.0
MAX_DIRECT_MEMORY=512M
SPARK_HOME=/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/spark
SPARK_CONF_DIR=$SPARK_HOME/conf
# Only needed for Mesos deploys
SPARK_EXECUTOR_URI=/home/spark/spark-1.6.0.tar.gz
# Only needed for YARN running outside of the cluster
# You will need to COPY these files from your cluster to the remote machine
# Normally these are kept on the cluster in /etc/hadoop/conf
# YARN_CONF_DIR=/pathToRemoteConf/conf
# HADOOP_CONF_DIR=/pathToRemoteConf/conf
#
# Also optional: extra JVM args for spark-submit
# export SPARK_SUBMIT_OPTS+="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5433"
SCALA_VERSION=2.10.4 # or 2.11.6
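This environment file is consumed by the deploy scripts mentioned in its header comment (bin/server_deploy etc.). A sketch of the deploy step that uses it, where the environment name "local" is inferred from the file names and the checkout path is an assumption:

# From a local checkout of spark-jobserver, with local.conf and local.sh under config/:
cd ~/spark-jobserver
bin/server_deploy.sh local    # packages the job server and copies it to DEPLOY_HOSTS over SSH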

What is the reason behind this error, and how can I get rid of it?

0 Answers:

No answers yet.