Running S3DistCp command through spark scala using scala.sys.process API

Posted: 2019-05-31 12:22:40

Tags: scala amazon-web-services apache-spark amazon-s3 process

I am running the following code in Spark Scala that copies some files from a Hadoop cluster to an AWS S3 bucket:

import scala.sys.process._

val s3DistCpCmd = "s3-dist-cp --src hdfs:///user/hadoop/folder1/ --dest s3://bucket/folder2"

val outputs = stringToProcess(s3DistCpCmd).!! // it hangs here

logger.info("outputs: [{}]", outputs)

but the code simply hangs at the third statement while executing the command.

When I run `s3-dist-cp --src hdfs:///user/hadoop/folder1/ --dest s3://bucket/folder2` directly on the node where the Spark driver is running, it works fine. But when I run it through scala.sys.process it just hangs: no output, no logs, no error messages, nothing.

Am I missing anything in the above? I would appreciate your help.
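Not an answer from the thread, but a hedged sketch of a common workaround: `!!` only collects stdout, and a chatty child process can block once an output pipe buffer fills up. Draining both stdout and stderr with a `ProcessLogger` avoids that. The helper name `S3DistCpRunner` and the use of `echo` as a stand-in command are illustrative assumptions, not from the original post.

```scala
import scala.sys.process._
import scala.collection.mutable.ArrayBuffer

// Illustrative helper (name is hypothetical): runs an external command
// while continuously draining stdout and stderr, so the child process
// cannot stall on a full pipe buffer the way `.!!` sometimes can.
object S3DistCpRunner {
  def run(cmd: Seq[String]): (Int, Seq[String], Seq[String]) = {
    val out = ArrayBuffer.empty[String]
    val err = ArrayBuffer.empty[String]
    // ProcessLogger consumes each line of stdout/stderr as it is produced.
    val exitCode = Process(cmd).!(ProcessLogger(out += _, err += _))
    (exitCode, out.toSeq, err.toSeq)
  }
}

// Example usage with a harmless stand-in command; for the real job you
// would pass Seq("s3-dist-cp", "--src", "hdfs:///user/hadoop/folder1/",
// "--dest", "s3://bucket/folder2") instead.
val (code, stdout, stderr) = S3DistCpRunner.run(Seq("echo", "hello"))
```

Passing the command as a `Seq` also sidesteps any shell tokenization surprises compared with the single-string form.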

0 Answers:

No answers yet