I am trying to run the big data benchmark on my EC2 cluster against my own fork of Spark, located here; the fork only modifies a few files in Spark core. My cluster consists of 1 master and 2 slaves of type m1.large, and I launched it with the ec2 scripts bundled with Spark. The cluster comes up fine and I can SSH into the master without problems. However, when I try to run the benchmark from the master with the command

./runner/prepare-benchmark.sh --shark --aws-key-id=xxxxxxxx --aws-key=xxxxxxxx --shark-host=<my-spark-master> --shark-identity-file=/root/.ssh/id_rsa --scale-factor=1

I get the following error:
=== IMPORTING BENCHMARK DATA FROM S3 ===
bash: /root/ephemeral-hdfs/bin/hdfs: No such file or directory
Connection to ec2-54-201-169-165.us-west-2.compute.amazonaws.com closed.
bash: /root/mapreduce/bin/start-mapred.sh: No such file or directory
Connection to ec2-54-201-169-165.us-west-2.compute.amazonaws.com closed.
Traceback (most recent call last):
File "./prepare_benchmark.py", line 606, in <module>
main()
File "./prepare_benchmark.py", line 594, in main
prepare_shark_dataset(opts)
File "./prepare_benchmark.py", line 192, in prepare_shark_dataset
ssh_shark("/root/mapreduce/bin/start-mapred.sh")
File "./prepare_benchmark.py", line 180, in ssh_shark
ssh(opts.shark_host, "root", opts.shark_identity_file, command)
File "./prepare_benchmark.py", line 139, in ssh
(identity_file, username, host, command), shell=True)
File "/usr/lib64/python2.6/subprocess.py", line 505, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'ssh -t -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa root@ec2-54-201-169-165.us-west-2.compute.amazonaws.com 'source /root/.bash_profile;
/root/mapreduce/bin/start-mapred.sh'' returned non-zero exit status 127
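If I read the traceback correctly, the benchmark is doing roughly the following when it fails (a minimal sketch I reconstructed from the traceback, not the actual code of prepare_benchmark.py): it shells out to ssh via subprocess.check_call, and exit status 127 is what bash returns when the script it is asked to run does not exist on the remote machine.

import subprocess

# Minimal sketch reconstructed from the traceback (not the real
# prepare_benchmark.py): check_call raises CalledProcessError whenever
# the remote command exits non-zero; 127 means "command not found".
def ssh(host, username, identity_file, command):
    subprocess.check_call(
        "ssh -t -o StrictHostKeyChecking=no -i %s %s@%s "
        "'source /root/.bash_profile; %s'" %
        (identity_file, username, host, command),
        shell=True)

# The call that fails for me:
ssh("ec2-54-201-169-165.us-west-2.compute.amazonaws.com", "root",
    "/root/.ssh/id_rsa", "/root/mapreduce/bin/start-mapred.sh")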
I have tried terminating the cluster and launching it again several times, but the problem persists. What could be the problem?
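Is there anything specific I should verify on the master? For example, here is a quick sanity check I put together from the paths in the error output (my own sketch, not part of the benchmark scripts), to see whether the scripts the benchmark expects are present at all:

import os.path

# Paths taken from the error messages above; nothing benchmark-specific here.
for path in ["/root/ephemeral-hdfs/bin/hdfs",
             "/root/mapreduce/bin/start-mapred.sh"]:
    print("%s: %s" % (path, "exists" if os.path.exists(path) else "MISSING"))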