我使用Zeppelin笔记本和Spark1.3.1以及Hadoop 2.6版。我能够运行整个教程,没有任何问题。
然后我创建了新的笔记本来运行简单的代码,从本地机器上的HDFS中存储的镶木地板文件中获取数据。以下是代码:
val param_alarms = sqlc.parquetFile("hdfs://localhost/ge_alarm/alarm_param")
param_alarms.registerTempTable("Alarm")
param_alarms.count()
此操作失败,并显示以下错误消息:
param_alarms: org.apache.spark.sql.DataFrame = [PARAMETER_ID: int, PARAMETER_NAME: string, OSM_NAME: string, DESCRIPTION: string, TIMESTAMP: bigint, Value: int]
java.lang.NoSuchMethodError: org.json4s.JsonDSL$.string2jvalue(Ljava/lang/String;)Lorg/json4s/JsonAST$JValue;
at org.apache.spark.sql.types.StructType$$anonfun$jsonValue$5.apply(dataTypes.scala:1065)
at org.apache.spark.sql.types.StructType$$anonfun$jsonValue$5.apply(dataTypes.scala:1065)
at org.json4s.JsonDSL$JsonAssoc.$tilde(JsonDSL.scala:86)
at org.apache.spark.sql.types.StructType.jsonValue(dataTypes.scala:1065)
at org.apache.spark.sql.types.StructType.jsonValue(dataTypes.scala:1015)
at org.apache.spark.sql.types.DataType.json(dataTypes.scala:265)
at org.apache.spark.sql.parquet.ParquetTypesConverter$.convertToString(ParquetTypes.scala:404)
at org.apache.spark.sql.parquet.ParquetRelation2.buildScan(newParquet.scala:437)
at org.apache.spark.sql.sources.DataSourceStrategy$$anonfun$1.apply(DataSourceStrategy.scala:38)
at org.apache.spark.sql.sources.DataSourceStrategy$$anonfun$1.apply(DataSourceStrategy.scala:38)
at org.apache.spark.sql.sources.DataSourceStrategy$.pruneFilterProjectRaw(DataSourceStrategy.scala:107)
at org.apache.spark.sql.sources.DataSourceStrategy$.apply(DataSourceStrategy.scala:34)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.planLater(QueryPlanner.scala:54)
at org.apache.spark.sql.execution.SparkStrategies$HashAggregation$.apply(SparkStrategies.scala:152)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:1081)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:1079)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:1085)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:1085)
at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:815)
at org.apache.spark.sql.DataFrame.count(DataFrame.scala:827)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:39)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:48)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:50)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:52)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:54)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:56)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:58)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:60)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:62)
at $iwC$$iwC$$iwC.<init>(<console>:64)
at $iwC$$iwC.<init>(<console>:66)
at $iwC.<init>(<console>:68)
at <init>(<console>:70)
at .<init>(<console>:74)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:582)
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:558)
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:551)
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:277)
at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
我还尝试添加以下依赖项:
%dep
z.reset()
z.load("org.json4s:json4s-native_2.11:3.2.10")
当我尝试运行%sql select * from Alarm
时,我会收到以下错误:
java.lang.reflect.InvocationTargetException
也没有运气。有人可以帮忙吗?
一位同事终于找到了解决方案,导致它与json4s不兼容。如果其他人面临同样的问题。它是由不兼容的json4s引起的。 要解决此问题,我需要更新以下内容:
修改zeppelin-server / pom.xml以使用swagger-jersey-jaxrs_2.10版本1.3.11
修改zeppelin-engine / pom.xml以使用org.reflections版本0.9.9-RC1
导出MAVEN_OPTS =&#39; -Xmx2048m -XX:MaxPermSize = 2048m&#39;
编译hadoop2.6和Spark1.3 mvn clean package -Pspark-1.3 -Dhadoop.version = 2.6.0 -Phadoop-2.6 -DskipTests
答案 0 :(得分:1)
一位同事终于找到了解决方案,导致它与json4s不兼容。如果其他人面临同样的问题。它是由不兼容的json4s引起的。要解决此问题,我需要更新以下内容:
修改zeppelin-server / pom.xml以使用swagger-jersey-jaxrs_2.10版本1.3.11
修改zeppelin-engine / pom.xml以使用org.reflections版本0.9.9-RC1
导出MAVEN_OPTS =&#39; -Xmx2048m -XX:MaxPermSize = 2048m&#39;
编译hadoop2.6和Spark1.3 mvn clean包-Pspark-1.3 -Dhadoop.version = 2.6.0 -Phadoop-2.6 -DskipTests