I am trying to get the Mahout spark-shell running on the Cloudera QuickStart VM. My .bashrc is set to:
export MAHOUT_HOME=/home/cloudera/Desktop/Mahout_0_11_1
export MAHOUT_LOCAL=true
export SPARK_HOME=/usr/lib/spark
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
When I run the Mahout spark-shell, I get the following error message:
java.lang.NoClassDefFoundError: com/sun/jersey/spi/container/servlet/ServletContainer
at org.apache.spark.status.api.v1.ApiRootResource$.getServletHandler(ApiRootResource.scala:187)
at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:68)
at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:74)
at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:190)
at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:141)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:466)
at org.apache.mahout.sparkbindings.package$.mahoutSparkContext(package.scala:91)
at org.apache.mahout.sparkbindings.shell.MahoutSparkILoop.createSparkContext(MahoutSparkILoop.scala:89)
...
followed by:
Mahout distributed context is available as "implicit val sdc".
java.lang.NullPointerException
at org.apache.spark.sql.execution.ui.SQLListener.<init>(SQLListener.scala:34)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:77)
at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1033)
at $iwC$$iwC.<init>(<console>:11)
at $iwC.<init>(<console>:19)
Answer 0 (score: 2)
In spark-env.sh, add
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)
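The $(...) substitution splices the output of the hadoop classpath subcommand into the variable. On the QuickStart VM the hadoop binary is normally already on the PATH, so a minimal sketch (verify the location with "which hadoop" first; the short form below is an assumption, not the answer's exact line) is:
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
echo "$SPARK_DIST_CLASSPATH"   # sanity check: should list the Hadoop jar directories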
and make sure jersey-servlet-1.9.jar is on the classpath.
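One quick way to check whether a jersey jar is already on that classpath (this assumes SPARK_DIST_CLASSPATH was exported as above):
echo "$SPARK_DIST_CLASSPATH" | tr ':' '\n' | grep -i jersey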
Go through all the *-env.sh scripts and set the environment variables as explicitly as possible; check each one, and then check the logs for errors.
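For example, a sketch that prints every export from the usual CDH config locations (the paths below are assumptions; adjust them to your layout):
grep -H '^export' /etc/spark/conf/spark-env.sh /etc/hadoop/conf/hadoop-env.sh 2>/dev/null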
Run
cd /
find . -name jersey-servlet-1.9.jar
and make sure the path where this file is found is on the classpath.
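If the jar turns up under one of the Hadoop lib directories (the path below is only an example; use whatever the find actually prints), it can be appended explicitly:
export SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:/usr/lib/hadoop/lib/jersey-servlet-1.9.jar"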
Edit: add jersey-server-1.9.jar to the $MAHOUT_HOME/lib/ directory.
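Assuming the jar was located by the find above (the source path here is illustrative, not guaranteed):
cp /usr/lib/hadoop/lib/jersey-server-1.9.jar "$MAHOUT_HOME/lib/"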