I have deployed spark job server (version 0.6.2) on a remote machine that runs a YARN cluster managed by Cloudera (version 5.8.2), following the instructions given here. After deployment, when I try to start the server, I get the following error:
Exception in thread "main" java.lang.NoSuchMethodError: akka.util.Helpers$.ConfigOps(Lcom/typesafe/config/Config;)Lcom/typesafe/config/Config;
    at akka.cluster.ClusterSettings.<init>(ClusterSettings.scala:28)
    at akka.cluster.Cluster.<init>(Cluster.scala:67)
    at akka.cluster.Cluster$.createExtension(Cluster.scala:42)
    at akka.cluster.Cluster$.createExtension(Cluster.scala:37)
    at akka.actor.ActorSystemImpl.registerExtension(ActorSystem.scala:654)
    at akka.actor.ExtensionId$class.apply(Extension.scala:79)
    at akka.cluster.Cluster$.apply(Cluster.scala:37)
    at akka.cluster.ClusterActorRefProvider.createRemoteWatcher(ClusterActorRefProvider.scala:66)
    at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:186)
    at akka.cluster.ClusterActorRefProvider.init(ClusterActorRefProvider.scala:58)
    at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
    at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
    at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
    at spark.jobserver.JobServer$.spark$jobserver$JobServer$$makeSupervisorSystem$1(JobServer.scala:128)
    at spark.jobserver.JobServer$$anonfun$main$1.apply(JobServer.scala:130)
    at spark.jobserver.JobServer$$anonfun$main$1.apply(JobServer.scala:130)
    at spark.jobserver.JobServer$.start(JobServer.scala:54)
    at spark.jobserver.JobServer$.main(JobServer.scala:130)
    at spark.jobserver.JobServer.main(JobServer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
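For context, the launch itself is just the start script run from the install directory; a minimal sketch of what was executed, assuming the INSTALL_DIR from local.sh below:

# on the remote (YARN) machine
cd /home/ubuntu/spark/deployed-job-server   # INSTALL_DIR from local.sh
./start_server.sh                           # wraps spark-submit, see script below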
Files on the server when start_server.sh is run:
local.conf:
# Template for a Spark Job Server configuration file
# When deployed these settings are loaded when job server starts
#
# Spark Cluster / Job Server configuration
spark {
  # spark.master will be passed to each job's JobContext
  # master = "local[4]"
  # master = "mesos://vm28-hulk-pub:5050"
  master = "yarn-client"

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 4

  jobserver {
    port = 8090
    jar-store-rootdir = /tmp/jobserver/jars
    context-per-jvm = true
    jobdao = spark.jobserver.io.JobFileDAO
    filedao {
      rootdir = /tmp/spark-job-server/filedao/data
    }
    # When using chunked transfer encoding with scala Stream job results, this is the size of each chunk
    result-chunk-size = 1m
  }

  # predefined Spark contexts
  # contexts {
  #   my-low-latency-context {
  #     num-cpu-cores = 1          # Number of cores to allocate. Required.
  #     memory-per-node = 512m     # Executor memory per node, -Xmx style eg 512m, 1G, etc.
  #   }
  #   # define additional contexts here
  # }

  # universal context configuration. These settings can be overridden, see README.md
  context-settings {
    num-cpu-cores = 2          # Number of cores to allocate. Required.
    memory-per-node = 512m     # Executor memory per node, -Xmx style eg 512m, 1G, etc.

    # in case spark distribution should be accessed from HDFS (as opposed to being installed on every mesos slave)
    # spark.executor.uri = "hdfs://namenode:8020/apps/spark/spark.tgz"

    # uris of jars to be loaded into the classpath for this context. Uris is a string list, or a string separated by commas ','
    # dependent-jar-uris = ["file:///some/path/present/in/each/mesos/slave/somepackage.jar"]

    # If you wish to pass any settings directly to the sparkConf as-is, add them here in passthrough,
    # such as hadoop connection settings that don't use the "spark." prefix
    passthrough {
      #es.nodes = "192.1.1.1"
    }
  }

  # This needs to match SPARK_HOME for cluster SparkContexts to be created successfully
  home = "/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/spark"
}
# Note that you can use this file to define settings not only for job server,
# but for your Spark jobs as well. Spark job configuration merges with this configuration file as defaults.
akka {
  remote.netty.tcp {
    # This controls the maximum message size, including job results, that can be sent
    # maximum-frame-size = 10 MiB
  }
}
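(For reference, port = 8090 above is where the job server's REST API listens once it is up; a quick sanity check, assuming the host is reachable, would look something like the following. It never gets that far here, since the process dies during ActorSystem startup.)

curl http://<server-host>:8090/contexts   # list running Spark contexts
curl http://<server-host>:8090/jars       # list uploaded job jars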
start_server.sh:
#!/bin/bash
# Script to start the job server
# Extra arguments will be spark-submit options, for example
# ./server_start.sh --jars cassandra-spark-connector.jar
#
# Environment vars (note settings.sh overrides):
# JOBSERVER_MEMORY - defaults to 1G, the amount of memory (eg 512m, 2G) to give to job server
# JOBSERVER_CONFIG - alternate configuration file to use
# JOBSERVER_FG - launches job server in foreground; defaults to forking in background
echo 'Starting job server...'
set -e
get_abs_script_path() {
  pushd . >/dev/null
  cd "$(dirname "$0")"
  appdir=$(pwd)
  popd >/dev/null
}
get_abs_script_path
. $appdir/setenv.sh
GC_OPTS="-XX:+UseConcMarkSweepGC
-verbose:gc -XX:+PrintGCTimeStamps -Xloggc:$appdir/gc.out
-XX:MaxPermSize=512m
-XX:+CMSClassUnloadingEnabled "
# To truly enable JMX in AWS and other containerized environments, also need to set
# -Djava.rmi.server.hostname equal to the hostname in that environment. This is specific
# depending on AWS vs GCE etc.
JAVA_OPTS="-XX:MaxDirectMemorySize=$MAX_DIRECT_MEMORY \
-XX:+HeapDumpOnOutOfMemoryError -Djava.net.preferIPv4Stack=true"
# -Dcom.sun.management.jmxremote.port=9999 \
# -Dcom.sun.management.jmxremote.rmi.port=9999 \
# -Dcom.sun.management.jmxremote.authenticate=false \
# -Dcom.sun.management.jmxremote.ssl=false"
MAIN="spark.jobserver.JobServer"
PIDFILE=$appdir/spark-jobserver.pid
if [ -f "$PIDFILE" ] && kill -0 $(cat "$PIDFILE"); then
echo 'Job server is already running'
exit 1
fi
# Code added
echo App dir: $appdir
echo Conf file_path: $conffile
echo Spark home: $SPARK_HOME
echo Main class path: $MAIN
cmd='$SPARK_HOME/bin/spark-submit --class $MAIN --driver-memory $JOBSERVER_MEMORY
--conf "spark.executor.extraJavaOptions=$LOGGING_OPTS"
--driver-java-options "$GC_OPTS $JAVA_OPTS $LOGGING_OPTS $CONFIG_OVERRIDES"
$@ $appdir/spark-job-server.jar $conffile'
# Code added
if [ -z "$JOBSERVER_FG" ]; then
eval $cmd &
echo $! > $PIDFILE
else
eval $cmd
fi
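For clarity, once the variables are substituted, the eval'd cmd above is effectively the following spark-submit call (SPARK_HOME and JOBSERVER_MEMORY come from local.sh / settings.sh; the remaining variables are set by setenv.sh, which is not shown):

/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/spark/bin/spark-submit \
  --class spark.jobserver.JobServer \
  --driver-memory 1G \
  --conf "spark.executor.extraJavaOptions=$LOGGING_OPTS" \
  --driver-java-options "$GC_OPTS $JAVA_OPTS $LOGGING_OPTS $CONFIG_OVERRIDES" \
  $appdir/spark-job-server.jar $conffile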
Files on the local machine before deployment (identical to the copy on the server above, except that context-per-jvm is false here):
local.conf:
# Template for a Spark Job Server configuration file
# When deployed these settings are loaded when job server starts
#
# Spark Cluster / Job Server configuration
spark {
  # spark.master will be passed to each job's JobContext
  # master = "local[4]"
  # master = "mesos://vm28-hulk-pub:5050"
  master = "yarn-client"

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 4

  jobserver {
    port = 8090
    jar-store-rootdir = /tmp/jobserver/jars
    context-per-jvm = false
    jobdao = spark.jobserver.io.JobFileDAO
    filedao {
      rootdir = /tmp/spark-job-server/filedao/data
    }
    # When using chunked transfer encoding with scala Stream job results, this is the size of each chunk
    result-chunk-size = 1m
  }

  # predefined Spark contexts
  # contexts {
  #   my-low-latency-context {
  #     num-cpu-cores = 1          # Number of cores to allocate. Required.
  #     memory-per-node = 512m     # Executor memory per node, -Xmx style eg 512m, 1G, etc.
  #   }
  #   # define additional contexts here
  # }

  # universal context configuration. These settings can be overridden, see README.md
  context-settings {
    num-cpu-cores = 2          # Number of cores to allocate. Required.
    memory-per-node = 512m     # Executor memory per node, -Xmx style eg 512m, 1G, etc.

    # in case spark distribution should be accessed from HDFS (as opposed to being installed on every mesos slave)
    # spark.executor.uri = "hdfs://namenode:8020/apps/spark/spark.tgz"

    # uris of jars to be loaded into the classpath for this context. Uris is a string list, or a string separated by commas ','
    # dependent-jar-uris = ["file:///some/path/present/in/each/mesos/slave/somepackage.jar"]

    # If you wish to pass any settings directly to the sparkConf as-is, add them here in passthrough,
    # such as hadoop connection settings that don't use the "spark." prefix
    passthrough {
      #es.nodes = "192.1.1.1"
    }
  }

  # This needs to match SPARK_HOME for cluster SparkContexts to be created successfully
  home = "/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/spark"
}
# Note that you can use this file to define settings not only for job server,
# but for your Spark jobs as well. Spark job configuration merges with this configuration file as defaults.
akka {
  remote.netty.tcp {
    # This controls the maximum message size, including job results, that can be sent
    # maximum-frame-size = 10 MiB
  }
}
local.sh:
# Environment and deploy file
# For use with bin/server_deploy, bin/server_package etc.
DEPLOY_HOSTS="xx.xx.xxx.xxx"
APP_USER=ubuntu
APP_GROUP=ubuntu
# optional SSH Key to login to deploy server
SSH_KEY=/home/xx/xx/xx.pem
INSTALL_DIR=/home/ubuntu/spark/deployed-job-server
LOG_DIR=/var/log/deployed-job-server
PIDFILE=spark-jobserver.pid
JOBSERVER_MEMORY=1G
SPARK_VERSION=1.6.0
MAX_DIRECT_MEMORY=512M
SPARK_HOME=/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/spark
SPARK_CONF_DIR=$SPARK_HOME/conf
# Only needed for Mesos deploys
SPARK_EXECUTOR_URI=/home/spark/spark-1.6.0.tar.gz
# Only needed for YARN running outside of the cluster
# You will need to COPY these files from your cluster to the remote machine
# Normally these are kept on the cluster in /etc/hadoop/conf
# YARN_CONF_DIR=/pathToRemoteConf/conf
# HADOOP_CONF_DIR=/pathToRemoteConf/conf
#
# Also optional: extra JVM args for spark-submit
# export SPARK_SUBMIT_OPTS+="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5433"
SCALA_VERSION=2.10.4 # or 2.11.6
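For completeness, the deployment itself followed the documented spark-jobserver flow; roughly the steps below, where the environment name local is an assumption that matches the file names above:

# on the local machine, inside the spark-jobserver checkout
cp config/local.sh.template config/local.sh      # then edited as shown above
cp config/local.conf.template config/local.conf  # then edited as shown above
bin/server_deploy.sh local   # copies the build to DEPLOY_HOSTS (via SSH_KEY) into INSTALL_DIR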
What is the reason behind this error, and how do I get rid of it?