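I'm trying to get a Spark/Shark cluster running but keep hitting the same problem. I followed the instructions at https://github.com/amplab/shark/wiki/Running-Shark-on-a-Cluster and set up Hive as described there.
I think the Shark driver is picking up a different version of the Hadoop jars, but I'm not sure why.
Details are below; any help would be great.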
Spark / Shark 0.9.0
Apache Hadoop 2.3.0
Amplabs Hive 0.11
Scala 2.10.3
Java 7
Everything is installed, but I get a few deprecation warnings followed by an exception:
14/03/14 11:24:47 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/03/14 11:24:47 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
The exception:
Exception in thread "main" org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1072)
at shark.memstore2.TableRecovery$.reloadRdds(TableRecovery.scala:49)
at shark.SharkCliDriver.<init>(SharkCliDriver.scala:275)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:162)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1139)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:51)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2288)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2299)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1070)
... 4 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1137)
... 9 more
Caused by: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
Answer 0 (score: 1)
I had the same problem, and I think it is caused by incompatible versions of Hadoop/Hive and Spark/Shark.
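You need to do one of two things. The first concerns the hadoop-core jar that Shark bundles under:
shark/lib_managed/jars/org.apache.hadoop/hadoop-core/
The second is to set SHARK_HADOOP_VERSION explicitly when building Shark, as follows: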
cd shark;
SHARK_HADOOP_VERSION=2.0.0-mr1-cdh4.5.0 ./sbt/sbt clean
SHARK_HADOOP_VERSION=2.0.0-mr1-cdh4.5.0 ./sbt/sbt package
The second approach also fixed other problems for me. See this thread for more details: https://groups.google.com/forum/#!msg/shark-users/lTNPcxHJiOQ/EqzyByZrzQMJ
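To check for the suspected mismatch before rebuilding, it may help to list which hadoop-core versions Shark has pulled into the directory mentioned above and compare them with what `hadoop version` reports on the cluster. This is a minimal sketch: the function name `list_hadoop_core_versions` is made up for illustration, and it assumes the jars follow the usual `hadoop-core-<version>.jar` naming.

```shell
# Hypothetical helper: print the version of every hadoop-core jar found
# under a directory. The default directory is the path from the answer;
# the hadoop-core-<version>.jar naming scheme is an assumption.
list_hadoop_core_versions() {
  dir="${1:-shark/lib_managed/jars/org.apache.hadoop/hadoop-core}"
  find "$dir" -name 'hadoop-core-*.jar' 2>/dev/null \
    | sed -e 's|.*/hadoop-core-||' -e 's|\.jar$||' \
    | sort -u
}
```

Run `list_hadoop_core_versions` from the directory that contains `shark/`, then run `hadoop version` on the cluster. If the versions printed do not match the cluster's Hadoop version, that would be consistent with the incompatibility described in the answer.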