How to install apache-spark 2.2.0 on a Mac with Homebrew

Date: 2018-04-13 03:39:17

Tags: apache-spark homebrew

" $ brew install apache-spark' 给我2.3.x版。 ' $ brew搜索apache-spark' 和 ' $ brew info apache-spark' 不提供安装其他版本的选项。 是否可以使用自制软件获得不同的版本?

5 Answers:

Answer 0 (score: 14):

Run these commands (assuming you already have apache-spark installed via Homebrew):

cd "$(brew --repo homebrew/core)"
git log Formula/apache-spark.rb

E.g., for version 2.2.0:

...

commit bdf68bd79ebd16a70b7a747e027afbe5831f9cc3
Author: ilovezfs
Date:   Tue Jul 11 22:19:12 2017 -0700

    apache-spark 2.2.0 (#15507)

...

git checkout -b apache-spark-2.2.0 bdf68bd79ebd16a70b7a747e027afbe5831f9cc3
brew unlink apache-spark
HOMEBREW_NO_AUTO_UPDATE=1 brew install apache-spark

Cleanup:

git checkout master
git branch -d apache-spark-2.2.0 

To check/switch versions:

brew list apache-spark --versions
brew switch apache-spark 2.2.0

Answer 1 (score: 4):

I had the same problem: when installing via Homebrew, by default it could only find the apache-spark 2.3.0 formula, and 2.2.0 was not to be found even among the removed formulae in the repository.

So I backed up the existing apache-spark.rb (version 2.3.0) from the path /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core/Formula and then overwrote it with the following:

class ApacheSpark < Formula
  desc "Engine for large-scale data processing"
  homepage "https://spark.apache.org/"
  url "https://www.apache.org/dyn/closer.lua?path=spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz"
  version "2.2.0"
  sha256 "97fd2cc58e08975d9c4e4ffa8d7f8012c0ac2792bcd9945ce2a561cf937aebcc"
  head "https://github.com/apache/spark.git"

  bottle :unneeded

  def install
    # Rename beeline to distinguish it from hive's beeline
    mv "bin/beeline", "bin/spark-beeline"

    rm_f Dir["bin/*.cmd"]
    libexec.install Dir["*"]
    bin.write_exec_script Dir["#{libexec}/bin/*"]
  end

  test do
    assert_match "Long = 1000", pipe_output(bin/"spark-shell", "sc.parallelize(1 to 1000).count()")
  end
end

Then I reinstalled following the process above (roughly the sequence sketched below), and now I can switch between 2.2.0 and 2.3.0.
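For reference, a minimal sketch of that back-up-and-overwrite sequence, assuming a default Homebrew prefix (the .bak name is just an example):

cd /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core/Formula
cp apache-spark.rb apache-spark.rb.bak    # keep the original 2.3.0 formula
# overwrite apache-spark.rb with the 2.2.0 formula shown above, then:
brew unlink apache-spark
HOMEBREW_NO_AUTO_UPDATE=1 brew install apache-spark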

Hope it helps.

Answer 2 (score: 2):

For posterity: trying to check out the older brew commit is pointless, because the URL in that formula (https://www.apache.org/dyn/closer.lua?path=spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz) is no longer valid. This also means the brew formula for 2.2.1 will not work as-is either.

At a minimum, you need to update the URL to http://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz (as @juanpaolo pointed out).

To install Spark 2.2.0 via Homebrew today (a command sketch follows the list):

  1. Grab the 2.2.0 formula (https://github.com/Homebrew/homebrew-core/blob/bdf68bd79ebd16a70b7a747e027afbe5831f9cc3/Formula/apache-spark.rb)
  2. Update the URL on line 4 from https://www.apache.org/dyn/closer.lua?path=spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz to http://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
  3. brew install <path-to-updated-formula>
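A minimal sketch of those three steps as shell commands; the raw.githubusercontent.com URL for the pinned commit is an assumption on my part, and the sed invocation uses macOS (BSD) syntax:

# 1. grab the formula pinned at the 2.2.0 commit (raw URL assumed)
curl -fsSL -o apache-spark.rb https://raw.githubusercontent.com/Homebrew/homebrew-core/bdf68bd79ebd16a70b7a747e027afbe5831f9cc3/Formula/apache-spark.rb
# 2. point the dead closer.lua mirror URL at archive.apache.org instead
sed -i '' 's|https://www.apache.org/dyn/closer.lua?path=spark/|http://archive.apache.org/dist/spark/|' apache-spark.rb
# 3. install from the updated local formula
brew install ./apache-spark.rb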

TL;DR / for the lazy:

brew install https://gist.githubusercontent.com/eddies/bc148d83b1fc5555520d0cdf2dff8553/raw/c7ce091a083cacb3519502860695b56b0b806070/apache-spark.rb

Or, via a brew tap:

brew tap eddies/spark-tap
brew install apache-spark@2.2.0

Answer 3 (score: 2):

I needed to install Apache Spark version 2.4.0 specifically on my MacBook. It is no longer available in the brew listing, but you can still make it work.

Install the latest Spark via brew install apache-spark. Say it installs apache-spark 3.0.1.

Once that is done, run brew edit apache-spark and edit apache-spark.rb as follows:

class ApacheSpark < Formula
  desc "Engine for large-scale data processing"
  homepage "https://spark.apache.org/"
  url "https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz"
  mirror "https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz"
  version "2.4.0"
  sha256 "c93c096c8d64062345b26b34c85127a6848cff95a4bb829333a06b83222a5cfa"
  license "Apache-2.0"
  head "https://github.com/apache/spark.git"

  bottle :unneeded

  depends_on "openjdk@8"

  def install
    # Rename beeline to distinguish it from hive's beeline
    mv "bin/beeline", "bin/spark-beeline"

    rm_f Dir["bin/*.cmd"]
    libexec.install Dir["*"]
    bin.install Dir[libexec/"bin/*"]
    bin.env_script_all_files(libexec/"bin", JAVA_HOME: Formula["openjdk@8"].opt_prefix)
  end

  test do
    assert_match "Long = 1000",
      pipe_output(bin/"spark-shell --conf spark.driver.bindAddress=127.0.0.1",
                  "sc.parallelize(1 to 1000).count()")
  end
end

Now uninstall Spark again with brew uninstall apache-spark, then reinstall it with brew install apache-spark.
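In other words, the two commands (both taken directly from the step above) are:

brew uninstall apache-spark
brew install apache-spark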

Result:

% spark-shell
2021-02-09 19:27:11 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://192.168.0.17:4040
Spark context available as 'sc' (master = local[*], app id = local-1612927640472).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0
      /_/
         
Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_282)
Type in expressions to have them evaluated.
Type :help for more information.

Answer 4 (score: 0):

You can also search the list of formulae available for apache-spark:

brew search apache-spark

Then tap it:

brew tap eddies/spark-tap

Then install the specific version that is available:

brew install apache-spark@2.3.2