Cannot access Spark Driver UI - HTTP ERROR 500

Posted: 2017-06-05 17:43:26

Tags: apache-spark

I am trying to access the Spark Driver UI running on port 4040 of my edge node (the application runs in client mode), but I get the following error.

HTTP ERROR 500
    javax.servlet.ServletException: Could not determine the proxy server for redirection
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:195)
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:141)
    at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.spark_project.jetty.servlets.gzip.GzipHandler.handle(GzipHandler.java:479)
    at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.spark_project.jetty.server.Server.handle(Server.java:499)
    at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:311)
    at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.spark_project.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.spark_project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)

To reach the edge node, I first have to connect to the local network over VPN (for example, to reach the edge node CLI, I connect to the VPN and then SSH into the edge node). I have tried forwarding the port and accessing the UI that way, but to no avail. Has anyone run into a similar access error? I should note that I know both the internal and external IPs of the edge node.
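For reference, the port forwarding I attempted looked roughly like the sketch below (the username and hostname are placeholders for my actual edge node, which is only reachable after connecting to the VPN):

```shell
# Forward local port 4040 to port 4040 on the edge node, where the
# Spark driver UI is listening. With the tunnel up, the UI would be
# opened at http://localhost:4040 in a local browser.
# -N: do not run a remote command, just hold the tunnel open.
# -L local_port:target_host:target_port
ssh -N -L 4040:localhost:4040 user@edge-node.internal
```

Even with the tunnel established, browsing to http://localhost:4040 still returns the HTTP 500 above, since the YARN AmIpFilter intercepts the request before the UI is served.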

Additional information: the Spark version is 2.1.0, running on a Cloudera cluster, so spark2-submit must be used:

spark2-submit --master yarn \
  --jars /home/hail-all-spark.jar \
  --py-files /home/pyhail.zip \
  --conf spark.driver.extraClassPath=./hail-all-spark.jar \
  --conf=spark.executor.extraClassPath=./hail-all-spark.jar \
  /home/hail_work/impala/vcf_to_impala_vds.py

Verbose output:

Using properties file: /opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/conf/spark-defaults.conf
Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
Adding default property: spark.yarn.jars=local:/opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/jars/*
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.hadoop.mapreduce.application.classpath=
Adding default property: spark.shuffle.service.enabled=true
Adding default property: spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
Adding default property: spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda-4.1.1/bin/python
Adding default property: spark.yarn.historyServer.address=http://master-9261ce5a.<>.local:18089
Adding default property: spark.ui.killEnabled=true
Adding default property: spark.sql.hive.metastore.jars=${env:HADOOP_COMMON_HOME}/../hive/lib/*:${env:HADOOP_COMMON_HOME}/client/*
Adding default property: spark.dynamicAllocation.schedulerBacklogTimeout=1
Adding default property: spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
Adding default property: spark.network.sasl.serverAlwaysEncrypt=true
Adding default property: spark.yarn.config.gatewayPath=/opt/cloudera/parcels
Adding default property: spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
Adding default property: spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=/opt/cloudera/parcels/Anaconda-4.1.1/bin/python
Adding default property: spark.submit.deployMode=client
Adding default property: spark.shuffle.service.port=7337
Adding default property: spark.master=yarn
Adding default property: spark.authenticate.enableSaslEncryption=true
Adding default property: spark.authenticate=true
Adding default property: spark.acls.enable=true
Adding default property: spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
Adding default property: spark.eventLog.dir=hdfs://rushdatascience/user/spark/spark2ApplicationHistory
Adding default property: spark.dynamicAllocation.enabled=true
Adding default property: spark.sql.catalogImplementation=hive
Adding default property: spark.hadoop.yarn.application.classpath=
Adding default property: spark.shuffle.encryption.enabled=true
Adding default property: spark.dynamicAllocation.minExecutors=0
Adding default property: spark.shuffle.encryption.keygen.algorithm=HmacSHA256
Adding default property: spark.shuffle.crypto.cipher.transformation=AES/CTR/NoPadding
Adding default property: spark.dynamicAllocation.executorIdleTimeout=60
Adding default property: spark.shuffle.encryption.keySizeBits=256
Adding default property: spark.sql.hive.metastore.version=1.1.0
Parsed arguments:
  master                  yarn
  deployMode              client
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          /opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/conf/spark-defaults.conf
  driverMemory            null
  driverCores             null
  driverExtraClassPath    ./hail-all-spark.jar
  driverExtraLibraryPath  /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 file:/home/<>/pyhail.zip
  archives                null
  mainClass               null
  primaryResource         file:/home/<>/hail_work/ml_on_vds.py
  name                    ml_on_vds.py
  childArgs               []
  jars                    file:/home/<>/hail-all-spark.jar
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file /opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/conf/spark-defaults.conf:
  spark.network.sasl.serverAlwaysEncrypt -> true
  spark.executor.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
  spark.authenticate -> true
  spark.yarn.jars -> local:/opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/jars/*
  spark.driver.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
  spark.shuffle.encryption.keygen.algorithm -> HmacSHA256
  spark.yarn.historyServer.address -> http://master-9261ce5a.<>.local:18089
  spark.yarn.am.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
  spark.eventLog.enabled -> true
  spark.acls.enable -> true
  spark.dynamicAllocation.schedulerBacklogTimeout -> 1
  spark.shuffle.crypto.cipher.transformation -> AES/CTR/NoPadding
  spark.yarn.config.gatewayPath -> /opt/cloudera/parcels
  spark.authenticate.enableSaslEncryption -> true
  spark.ui.killEnabled -> true
  spark.serializer -> org.apache.spark.serializer.KryoSerializer
  spark.shuffle.encryption.keySizeBits -> 256
  spark.dynamicAllocation.executorIdleTimeout -> 60
  spark.dynamicAllocation.minExecutors -> 0
  spark.hadoop.yarn.application.classpath ->
  spark.shuffle.service.enabled -> true
  spark.yarn.config.replacementPath -> {{HADOOP_COMMON_HOME}}/../../..
  spark.sql.hive.metastore.version -> 1.1.0
  spark.submit.deployMode -> client
  spark.shuffle.service.port -> 7337
  spark.hadoop.mapreduce.application.classpath ->
  spark.shuffle.encryption.enabled -> true
  spark.executor.extraClassPath -> ./hail-all-spark.jar
  spark.eventLog.dir -> hdfs://rushdatascience/user/spark/spark2ApplicationHistory
  spark.master -> yarn
  spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON -> /opt/cloudera/parcels/Anaconda-4.1.1/bin/python
  spark.dynamicAllocation.enabled -> true
  spark.sql.catalogImplementation -> hive
  spark.yarn.appMasterEnv.PYSPARK_PYTHON -> /opt/cloudera/parcels/Anaconda-4.1.1/bin/python
  spark.sql.hive.metastore.jars -> ${env:HADOOP_COMMON_HOME}/../hive/lib/*:${env:HADOOP_COMMON_HOME}/client/*
  spark.driver.extraClassPath -> ./hail-all-spark.jar


Main class:
org.apache.spark.deploy.PythonRunner
Arguments:
file:/home/<>/hail_work/ml_on_vds.py
file:/home/<>/pyhail.zip
System properties:
spark.network.sasl.serverAlwaysEncrypt -> true
spark.executor.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
spark.yarn.dist.jars -> file:/home/<>/hail-all-spark.jar
spark.authenticate -> true
spark.yarn.jars -> local:/opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/jars/*
spark.driver.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
spark.shuffle.encryption.keygen.algorithm -> HmacSHA256
spark.yarn.historyServer.address -> http://master-9261ce5a.<>.local:18089
spark.yarn.am.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/lib/native
spark.eventLog.enabled -> true
spark.acls.enable -> true
spark.dynamicAllocation.schedulerBacklogTimeout -> 1
spark.shuffle.crypto.cipher.transformation -> AES/CTR/NoPadding
SPARK_SUBMIT -> true
spark.submit.pyFiles -> /home/<>/pyhail.zip
spark.yarn.config.gatewayPath -> /opt/cloudera/parcels
spark.authenticate.enableSaslEncryption -> true
spark.ui.killEnabled -> true
spark.serializer -> org.apache.spark.serializer.KryoSerializer
spark.app.name -> ml_on_vds.py
spark.shuffle.service.enabled -> true
spark.hadoop.yarn.application.classpath ->
spark.dynamicAllocation.minExecutors -> 0
spark.dynamicAllocation.executorIdleTimeout -> 60
spark.shuffle.encryption.keySizeBits -> 256
spark.yarn.config.replacementPath -> {{HADOOP_COMMON_HOME}}/../../..
spark.sql.hive.metastore.version -> 1.1.0
spark.submit.deployMode -> client
spark.shuffle.service.port -> 7337
spark.executor.extraClassPath -> ./hail-all-spark.jar
spark.shuffle.encryption.enabled -> true
spark.hadoop.mapreduce.application.classpath ->
spark.eventLog.dir -> hdfs://rushdatascience/user/spark/spark2ApplicationHistory
spark.yarn.isPython -> true
spark.master -> yarn
spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON -> /opt/cloudera/parcels/Anaconda-4.1.1/bin/python
spark.dynamicAllocation.enabled -> true
spark.sql.catalogImplementation -> hive
spark.sql.hive.metastore.jars -> ${env:HADOOP_COMMON_HOME}/../hive/lib/*:${env:HADOOP_COMMON_HOME}/client/*
spark.yarn.appMasterEnv.PYSPARK_PYTHON -> /opt/cloudera/parcels/Anaconda-4.1.1/bin/python
spark.driver.extraClassPath -> ./hail-all-spark.jar
Classpath elements:
file:/home/<>/hail-all-spark.jar

0 Answers:

No answers yet