Upgrading the Spark version | Cloudera

Time: 2016-12-06 10:58:45

Tags: apache-spark cloudera cloudera-manager

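I am using CDH 5.9.0 for my cluster setup. The default Spark service parcel shipped by Cloudera is 1.6.0. I need to upgrade it to 1.6.3, because a distributed cache issue has been fixed in the following git commit: https://github.com/RicoGit/spark/commit/e5f1d9c8f9c94615322aaf7508e753307f553d53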

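I would appreciate knowing a clean way to upgrade the Spark service deployed on Cloudera. As an extension of this, how can I also upgrade to Spark 2.0 on the same cluster?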

Thanks.

2 answers:

Answer 0: (score: 2)

Cloudera recently released Spark 2.0 parcels, which you can download from the spark archive.

Follow the link for the installation steps.

Note: Apache Spark 2.0 can only be installed on CDH 5.7, CDH 5.8, or CDH 5.9 clusters, and it requires a minimum CM version of 5.8.3, 5.9, or later.
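
The version requirement in the note can be checked before attempting the install. The following is a minimal sketch, assuming you supply the CDH and Cloudera Manager version strings yourself; the function name spark2_supported is hypothetical, and sort -V assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Cloudera tooling): returns 0 when the
# given CDH and Cloudera Manager versions meet the requirement in the note.
spark2_supported() {
  local cdh="$1" cm="$2" lowest
  # CDH must be a 5.7, 5.8 or 5.9 release.
  case "$cdh" in
    5.7|5.7.*|5.8|5.8.*|5.9|5.9.*) ;;
    *) return 1 ;;
  esac
  # CM must be at least 5.8.3. sort -V orders version strings, so the
  # smaller of the two versions ends up on the first line.
  lowest=$(printf '%s\n%s\n' "5.8.3" "$cm" | sort -V | head -n 1)
  [ "$lowest" = "5.8.3" ]
}
```

For example, spark2_supported 5.9.0 5.9.0 succeeds, while spark2_supported 5.9.0 5.8.2 fails because the CM version is too old.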

Answer 1: (score: 2)

Just follow these steps:

https://gist.github.com/shredder47/ce2f158a2a3907c0d264c5e9e4aab2fa

java -version
sudo yum remove java
sudo yum install java-1.8.0-openjdk
source ~/.bash_profile

Download Spark 2.4.7 pre-built for Hadoop 2.6 (tar archive).
Extract the contents.
Move the contents of the extracted folder to:

/usr/local/spark
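
The download, extract, and move steps can be sketched as one small function. This is only an illustration: the function name unpack_spark, the build name spark-2.4.7-bin-hadoop2.6, and the archive.apache.org URL in the comment are assumptions, not something the answer specifies.

```shell
#!/usr/bin/env bash
# Assumed build name; adjust it to the tarball you actually downloaded.
SPARK_BUILD="spark-2.4.7-bin-hadoop2.6"

# Hypothetical helper: extract a Spark tarball straight into a target directory.
# $1 = path to the .tgz, $2 = destination (the answer uses /usr/local/spark).
unpack_spark() {
  local tarball="$1" dest="$2"
  mkdir -p "$dest" || return 1
  # --strip-components=1 drops the top-level folder inside the archive,
  # so bin/, conf/, jars/ and so on land directly under $dest.
  tar -xzf "$tarball" -C "$dest" --strip-components=1
}

# Possible usage (assumed URL; run as a user that can write to the destination):
#   curl -fLO "https://archive.apache.org/dist/spark/spark-2.4.7/${SPARK_BUILD}.tgz"
#   unpack_spark "${SPARK_BUILD}.tgz" /usr/local/spark
```

Extracting directly into the destination avoids a separate move step; if you prefer to follow the answer literally, extract in place and move the folder contents afterwards.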

Now open:

/usr/bin/pyspark
/usr/bin/spark-shell
/usr/bin/spark-submit


and change the exec line in each file to the matching line below:

'exec /usr/local/spark/bin/pyspark "$@"'
'exec /usr/local/spark/bin/spark-shell "$@"'
'exec /usr/local/spark/bin/spark-submit "$@"'
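
The three manual edits could also be scripted. The sketch below assumes each launcher file already contains a single line starting with exec, which is what the answer implies but which is worth checking on your own cluster; the function name point_wrappers_at_spark is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the exec line in each launcher script so that
# it calls the matching binary under the new Spark directory.
# $1 = directory holding the launchers (default /usr/bin)
# $2 = new Spark location (default /usr/local/spark)
point_wrappers_at_spark() {
  local bin_dir="${1:-/usr/bin}" spark_home="${2:-/usr/local/spark}" name
  for name in pyspark spark-shell spark-submit; do
    # -i.bak edits in place and keeps the original as <name>.bak,
    # so the change can be reverted if something goes wrong.
    sed -i.bak "s|^[[:space:]]*exec .*|exec $spark_home/bin/$name \"\$@\"|" \
      "$bin_dir/$name" || return 1
  done
}
```

On a real cluster this would need root, for example by running the function from a root shell. Trying it first against copies of the three files in a scratch directory is a safer way to confirm the result.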

Now run Spark to check the version.
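
One way to do that check from a script is to pull the version number out of the banner that spark-submit --version prints. This is a rough sketch; the helper name extract_spark_version is hypothetical, and it assumes the banner contains the word "version" followed by the number.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Spark's version banner on stdin and print the
# first version number that follows the word "version".
extract_spark_version() {
  sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Possible usage (the banner is written to stderr, hence 2>&1):
#   spark-submit --version 2>&1 | extract_spark_version
```

If the launcher edits worked, the printed number should match the release you unpacked rather than the CDH default.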