I have an offline PySpark cluster (no internet access) and need to install the graphframes library.
I manually downloaded the jar from here and added it to $SPARK_HOME/jars/, but when I try to use it I get the following error:
error: missing or invalid dependency detected while loading class file 'Logging.class'.
Could not access term typesafe in package com,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'Logging.class' was compiled against an incompatible version of com.
error: missing or invalid dependency detected while loading class file 'Logging.class'.
Could not access term scalalogging in value com.typesafe,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'Logging.class' was compiled against an incompatible version of com.typesafe.
error: missing or invalid dependency detected while loading class file 'Logging.class'.
Could not access type LazyLogging in value com.slf4j,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'Logging.class' was compiled against an incompatible version of com.slf4j.
What is the right way to install all the dependencies offline?
Answer 0 (score: 1)
I managed to install the graphframes library. First, I found the graphframes dependencies, which are:
scala-logging-api_xx-xx.jar
scala-logging-slf4j_xx-xx.jar
where xx is the appropriate Scala and jar version. Then I installed them in the correct path. Since I work on a Cloudera machine, the correct path is:
/opt/cloudera/parcels/SPARK2/lib/spark2/jars/
If you cannot place them in this directory on the cluster (because you have no root privileges and your admin is super lazy), you can simply add the jars to your spark-submit/spark-shell invocation:
spark-submit ..... --driver-class-path /path-for-jar/ \
--jars /../graphframes-0.5.0-spark2.1-s_2.11.jar,/../scala-logging-slf4j_2.10-2.1.2.jar,/../scala-logging-api_2.10-2.1.2.jar
This works for Scala. In order to use graphframes with Python, you need to download the graphframes jar and then, through a shell:
#Extract JAR content
jar xf graphframes_graphframes-0.3.0-spark2.0-s_2.11.jar
#Enter the folder
cd graphframes
#Zip the contents
zip graphframes.zip -r *
Then add the zipped file to your Python path in spark-env.sh or your bash_profile, with
export PYTHONPATH=$PYTHONPATH:/..proper path/graphframes.zip:.
Then, opening the shell or submitting (again with the same arguments as for Scala), importing graphframes works normally.
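As a side note, the reason the zip-on-PYTHONPATH trick works is that Python can import packages directly from a zip archive placed on `sys.path`. The hypothetical sketch below demonstrates the mechanism with a tiny stand-in package built on the fly (the names `demo.zip` and `demopkg` are made up for illustration; `graphframes.zip` from the steps above is picked up the same way):

```python
import os
import sys
import tempfile
import zipfile

# Build a tiny stand-in package inside a zip archive
# (graphframes.zip produced by the steps above has the same layout:
# a top-level package directory with an __init__.py).
tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("demopkg/__init__.py", "VERSION = '0.1'\n")

# Putting the zip on sys.path at runtime is equivalent to
# appending it to PYTHONPATH before starting the interpreter.
sys.path.insert(0, zip_path)

import demopkg
print(demopkg.VERSION)  # prints 0.1
```

This is why the `export PYTHONPATH=...` line is enough: no unzipping on the worker side is needed, as long as the archive contains the package directory at its root.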
This link was very helpful for this solution.