Mahout 0.11.1 Spark-Shell NoClassDefFoundError

Date: 2016-02-04 01:03:48

Tags: java apache-spark jersey mahout

I am trying to get the Mahout Spark-Shell running on the Cloudera QuickStart VM. My .bashrc is set to:

export MAHOUT_HOME=/home/cloudera/Desktop/Mahout_0_11_1
export MAHOUT_LOCAL=true
export SPARK_HOME=/usr/lib/spark
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera

When I run the Mahout Spark-Shell, I get the following error message:

java.lang.NoClassDefFoundError: com/sun/jersey/spi/container/servlet/ServletContainer
    at org.apache.spark.status.api.v1.ApiRootResource$.getServletHandler(ApiRootResource.scala:187)
    at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:68)
    at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:74)
    at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:190)
    at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:141)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:466)
    at org.apache.mahout.sparkbindings.package$.mahoutSparkContext(package.scala:91)
    at org.apache.mahout.sparkbindings.shell.MahoutSparkILoop.createSparkContext(MahoutSparkILoop.scala:89)
...

Then:

Mahout distributed context is available as "implicit val sdc".
java.lang.NullPointerException
    at org.apache.spark.sql.execution.ui.SQLListener.<init>(SQLListener.scala:34)
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:77)
    at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1033)
    at $iwC$$iwC.<init>(<console>:11)
    at $iwC.<init>(<console>:19)

1 Answer:

Answer 0 (score: 2):

In spark-env.sh, add

export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)
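
To confirm the setting is picked up, you can source the file and print the variable. This is just a sketch; /etc/spark/conf/spark-env.sh is an assumed location for the Cloudera QuickStart VM, so adjust the path to your install:

# Source spark-env.sh and check that the classpath expands as expected.
# The conf path is an assumption for the Cloudera QuickStart VM.
. /etc/spark/conf/spark-env.sh
echo "$SPARK_DIST_CLASSPATH"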

Also make sure that jersey-servlet-1.9.jar is on the classpath.
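
For instance, to check that a given jar really contains the class from the stack trace, you can grep its listing (a sketch; /path/to/jersey-servlet-1.9.jar is a placeholder, use the location on your system):

# List the jar's contents and grep for the class Spark cannot find.
# The jar path below is a placeholder, not a real location.
unzip -l /path/to/jersey-servlet-1.9.jar | grep 'com/sun/jersey/spi/container/servlet/ServletContainer'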

Go through all of the *-env.sh scripts, set the environment variables as explicitly as possible, check each one, and then check the logs for errors.

Run

cd /
find . -name jersey-servlet-1.9.jar

and make sure the path where this file is found is on the classpath.
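
If the jar's location is not on the classpath yet, one way to append it is the following sketch; the find-based lookup is only illustrative, and you may already know the exact path:

# Locate the jar and append it to the classpath that Spark distributes.
JERSEY_JAR="$(find / -name 'jersey-servlet-1.9.jar' 2>/dev/null | head -n 1)"
export SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:$JERSEY_JAR"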

Edit: add jersey-server-1.9.jar to the $MAHOUT_HOME/lib/ directory.
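
A sketch of that step, reusing find to locate the jar (the search root and the jar's location will vary by system):

# Copy the server jar into Mahout's lib directory so the shell launcher sees it.
cp "$(find / -name 'jersey-server-1.9.jar' 2>/dev/null | head -n 1)" "$MAHOUT_HOME/lib/"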