How do I execute external commands with Scala's Process API in cluster mode?

Asked: 2017-05-12 10:26:08

Tags: scala hadoop apache-spark hdfs spark-submit

I want to run an external command from a Spark application using Scala's Process API:

s"hdfs dfs -cat $folderName$fileName"

The application works when I spark-submit it to the cluster, but fails once I add --deploy-mode cluster. Why?

This is the error I get:

17/05/22 08:20:35 INFO yarn.YarnRMClient: Registering the ApplicationMaster
log4j:ERROR Could not read configuration file from URL [file:/var/run/cloudera-scm-agent/process/975-yarn-NODEMANAGER/log4j.properties].
java.io.FileNotFoundException: /var/run/cloudera-scm-agent/process/975-yarn-NODEMANAGER/log4j.properties (Permission denied)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
        at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
        at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
        at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
        at org.apache.log4j.Logger.getLogger(Logger.java:104)
        at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:262)
        at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:108)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1025)
        at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:844)
        at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541)
        at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292)
        at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269)
        at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
        at org.apache.hadoop.fs.FsShell.<clinit>(FsShell.java:47)
log4j:ERROR Ignoring configuration file [file:/var/run/cloudera-scm-agent/process/975-yarn-NODEMANAGER/log4j.properties].
17/05/22 08:20:35 INFO util.Utils: Using initial executors = 0, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
17/05/22 08:20:35 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
java.io.IOException: Pipe closed
        at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:260)
        at java.io.PipedInputStream.awaitSpace(PipedInputStream.java:268)
        at java.io.PipedInputStream.receive(PipedInputStream.java:231)
        at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
        at scala.sys.process.BasicIO$.loop$1(BasicIO.scala:236)
        at scala.sys.process.BasicIO$.transferFullyImpl(BasicIO.scala:242)
        at scala.sys.process.BasicIO$.transferFully(BasicIO.scala:223)
        at scala.sys.process.ProcessImpl$PipeThread.runloop(ProcessImpl.scala:159)
        at scala.sys.process.ProcessImpl$PipeSource.run(ProcessImpl.scala:179)
17/05/22 08:20:38 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.RuntimeException: Nonzero exit value: 2
java.lang.RuntimeException: Nonzero exit value: 2
        at scala.sys.package$.error(package.scala:27)
        at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:132)
        at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:102)
        at ExternalCommandsTry$.main(ExternalCommandsTry.scala:10)
        at ExternalCommandsTry.main(ExternalCommandsTry.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:646)
17/05/22 08:20:38 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: Nonzero exit value: 2)
17/05/22 08:20:38 INFO spark.SparkContext: Invoking stop() from shutdown hook
17/05/22 08:20:38 INFO server.ServerConnector: Stopped ServerConnector@50eaca3{HTTP/1.1}{0.0.0.0:0}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@29700de3{/stages/stage/kill,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@22c7c413{/jobs/job/kill,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3e028911{/api,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@566d1ce8{/,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3af63959{/static,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4031b51{/executors/threadDump/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7fc2f8ac{/executors/threadDump,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1cde053a{/executors/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@39c0e161{/executors,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3dc8ce53{/environment/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5c77f11c{/environment,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@71ba803b{/storage/rdd/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4ee95caf{/storage/rdd,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1c4fb22b{/storage/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4adce78f{/storage,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@26a339f0{/stages/pool/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@373cb914{/stages/pool,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@38ab9f6{/stages/stage/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@e5c59e2{/stages/stage,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@430a409b{/stages/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@20529a01{/stages,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@49c89265{/jobs/job/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@31cf8a82{/jobs/job,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@65d0dd65{/jobs/json,null,UNAVAILABLE}
17/05/22 08:20:38 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@59261abf{/jobs,null,UNAVAILABLE}
17/05/22 08:20:38 INFO ui.SparkUI: Stopped Spark web UI at http://10.0.68.139:53340
17/05/22 08:20:38 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
17/05/22 08:20:38 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
17/05/22 08:20:38 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
 services=List(),
 started=false)
17/05/22 08:20:38 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/05/22 08:20:38 INFO memory.MemoryStore: MemoryStore cleared
17/05/22 08:20:38 INFO storage.BlockManager: BlockManager stopped
17/05/22 08:20:38 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
17/05/22 08:20:38 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/05/22 08:20:38 INFO spark.SparkContext: Successfully stopped SparkContext
17/05/22 08:20:38 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.lang.RuntimeException: Nonzero exit value: 2)
17/05/22 08:20:38 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
17/05/22 08:20:38 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://nameservice1/user/roswal01/.sparkStaging/application_1493995278161_1838
17/05/22 08:20:38 INFO util.ShutdownHookManager: Shutdown hook called
17/05/22 08:20:38 INFO util.ShutdownHookManager: Deleting directory /mnt/hdfs/6/yarn/nm/usercache/roswal01/appcache/application_1493995278161_1838/spark-0ca2fcde-b81c-4a61-9c9c-2d91949d8df7
17/05/22 08:20:38 INFO util.ShutdownHookManager: Deleting directory /mnt/hdfs/2/yarn/nm/usercache/roswal01/appcache/application_1493995278161_1838/spark-535b60d9-3dad-4ec7-9d18-f0783724b8d4
17/05/22 08:20:38 INFO util.ShutdownHookManager: Deleting directory /mnt/hdfs/5/yarn/nm/usercache/roswal01/appcache/application_1493995278161_1838/spark-ee18a0d2-6681-471d-949c-5c2b76855b8d
17/05/22 08:20:38 INFO util.ShutdownHookManager: Deleting directory /mnt/hdfs/3/yarn/nm/usercache/roswal01/appcache/application_1493995278161_1838/spark-a39456d7-3186-4d52-9090-a790ebb5e54e
17/05/22 08:20:38 INFO util.ShutdownHookManager: Deleting directory /mnt/hdfs/4/yarn/nm/usercache/roswal01/appcache/application_1493995278161_1838/spark-c6cb6932-c59a-48b8-912d-ae10215ea372
17/05/22 08:20:38 INFO util.ShutdownHookManager: Deleting directory /mnt/hdfs/1/yarn/nm/usercache/roswal01/appcache/application_1493995278161_1838/spark-35543c15-0076-4d28-be40-ea777745cac2

LogType:stdout
Log Upload Time:Mon May 22 08:20:40 -0400 2017
LogLength:98
Log Contents:
I/O error Pipe closed for process: [hadoop, fs, -cat, /data/test/zipfiletest/pgp_sample_file.PGP]
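
The driver log above reports only "Nonzero exit value: 2", which does not show what the child process wrote to stderr. The following is a minimal diagnostic sketch, assuming nothing beyond scala.sys.process: it runs one command, collects stdout and stderr separately through a ProcessLogger, and returns the exit code instead of throwing. The names ProcDiag, Result, and run are hypothetical and not part of the original application.

```scala
import scala.collection.mutable.ListBuffer
import scala.sys.process._

// Hypothetical helper for diagnosis; not taken from the original application.
object ProcDiag {
  final case class Result(exitCode: Int, stdout: List[String], stderr: List[String])

  // Runs the command without a shell; each Seq element is passed as one argument.
  def run(cmd: Seq[String]): Result = {
    val out = ListBuffer.empty[String]
    val err = ListBuffer.empty[String]
    // `!` with a ProcessLogger blocks until the process exits and returns
    // its exit code; unlike `!!`, it does not throw on a non-zero exit.
    val code = Process(cmd).!(ProcessLogger(line => out += line, line => err += line))
    Result(code, out.toList, err.toList)
  }
}
```

Calling something like ProcDiag.run(Seq("hdfs", "dfs", "-cat", s"$folderName$fileName")) from the driver and logging the stderr lines might show why the command exits with 2 when it runs inside the YARN container.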

The full sequence of commands to execute is:

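import scala.sys.process._
// Stream the file from HDFS into gpg, which writes the decrypted output to /tmp.
val result1 = (s"hdfs dfs -cat $folderName$fileName" #| s"gpg --batch --no-tty --yes --passphrase abcdefgh -d -o /tmp/$paraname").!
// Upload the decrypted file back to HDFS, then remove the local copy.
val result2 = s"hadoop fs -put /tmp/$paraname $outputpathwithoutNameService".!
val result3 = s"rm /tmp/$paraname".!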
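
The first stage of the pipeline above only streams bytes into gpg. As an alternative sketch, scala.sys.process can feed any java.io.InputStream to a command's stdin with #<, which avoids starting a second child process for the producer side. In a Spark driver that stream could perhaps come from Hadoop's FileSystem.open, but that is an assumption and is not shown here. StdinFeed and pipeInto are hypothetical names, and cat would only be a stand-in for gpg when trying it out.

```scala
import java.io.InputStream
import scala.sys.process._

// Hypothetical helper; a sketch of feeding a stream to a command's stdin.
object StdinFeed {
  // Sends the bytes of `in` to the command's stdin and returns the
  // command's stdout as a String.
  // `!!` throws a RuntimeException if the command exits with a non-zero code.
  def pipeInto(cmd: Seq[String], in: InputStream): String =
    (Process(cmd) #< in).!!
}
```

For example, StdinFeed.pipeInto(Seq("cat"), new java.io.ByteArrayInputStream("hello\n".getBytes("UTF-8"))) should hand the text back unchanged, since cat copies stdin to stdout.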

Both hdfs dfs -cat /data/test/zipfiletest/pgp_sample_file.PGP and gpg work fine on their own; the problem appears to be the pipe between the two.
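
One way to check that the pipe alone is at fault is to run the two stages without #|, handing the data over through a temporary file with #> and #<. The sketch below assumes that approach and is not the original code. StagedRun and viaTempFile are hypothetical names, and the producer and consumer arguments stand in for the real hdfs and gpg commands.

```scala
import java.io.File
import java.nio.file.Files
import scala.sys.process._

// Hypothetical helper; decouples two commands through a temp file instead of a pipe.
object StagedRun {
  // Runs `producer` with stdout redirected to a temp file, then runs
  // `consumer` with that file as stdin. Returns (producerExit, consumerExit)
  // so the failing stage can be identified.
  def viaTempFile(producer: Seq[String], consumer: Seq[String]): (Int, Int) = {
    val tmp: File = Files.createTempFile("stage", ".bin").toFile
    try {
      val producerExit = (Process(producer) #> tmp).!
      val consumerExit = (Process(consumer) #< tmp).!
      (producerExit, consumerExit)
    } finally {
      tmp.delete() // best-effort cleanup of the intermediate file
    }
  }
}
```

If both exit codes come back as 0 in cluster mode while the piped version still fails, that would support the suspicion about the pipe; a non-zero first value would instead point at the hdfs command itself.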

0 answers