Why do I get a "Table not found" error when using Spark SQL with Hive?

Date: 2017-02-16 13:49:15

Tags: java eclipse hadoop apache-spark hive

Here is my code:

    SparkConf sparkConf = new SparkConf().setAppName("Appname").setMaster("local[2]");
    ctx = new JavaSparkContext(sparkConf);
    SQLContext hc = new HiveContext(ctx.sc());
    String result = hc.sql("select count(*) from health").collect().toString();
    System.out.print(result);

This is the exception that prevents my program from running:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/aims/hadoop/hadoop/spark/lib/spark-assembly-1.6.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/aims/hadoop/hadoop/spark/lib/spark-examples-1.6.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/02/16 19:52:05 INFO SparkContext: Running Spark version 1.6.0
17/02/16 19:52:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/16 19:52:06 WARN Utils: Your hostname, aims resolves to a loopback address: 127.0.1.1; using 10.0.0.3 instead (on interface wlp2s0)
17/02/16 19:52:06 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/02/16 19:52:06 INFO SecurityManager: Changing view acls to: aims
17/02/16 19:52:06 INFO SecurityManager: Changing modify acls to: aims
17/02/16 19:52:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aims); users with modify permissions: Set(aims)
17/02/16 19:52:08 INFO Utils: Successfully started service 'sparkDriver' on port 37954.
17/02/16 19:52:08 INFO Slf4jLogger: Slf4jLogger started
17/02/16 19:52:08 INFO Remoting: Starting remoting
17/02/16 19:52:10 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.0.3:42090]
17/02/16 19:52:10 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 42090.
17/02/16 19:52:11 INFO SparkEnv: Registering MapOutputTracker
17/02/16 19:52:11 INFO SparkEnv: Registering BlockManagerMaster
17/02/16 19:52:11 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-2017cec9-8176-4e77-9f4e-599313a243ac
17/02/16 19:52:11 INFO MemoryStore: MemoryStore started with capacity 419.3 MB
17/02/16 19:52:11 INFO SparkEnv: Registering OutputCommitCoordinator
17/02/16 19:52:21 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/02/16 19:52:21 INFO SparkUI: Started SparkUI at http://10.0.0.3:4040
17/02/16 19:52:22 INFO Executor: Starting executor ID driver on host localhost
17/02/16 19:52:22 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39952.
17/02/16 19:52:22 INFO NettyBlockTransferService: Server created on 39952
17/02/16 19:52:22 INFO BlockManagerMaster: Trying to register BlockManager
17/02/16 19:52:22 INFO BlockManagerMasterEndpoint: Registering block manager localhost:39952 with 419.3 MB RAM, BlockManagerId(driver, localhost, 39952)
17/02/16 19:52:22 INFO BlockManagerMaster: Registered BlockManager
17/02/16 19:52:23 INFO HiveContext: Initializing execution hive, version 1.2.1
17/02/16 19:52:23 INFO ClientWrapper: Inspected Hadoop version: 2.6.0
17/02/16 19:52:23 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
17/02/16 19:52:24 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/02/16 19:52:24 INFO ObjectStore: ObjectStore, initialize called
17/02/16 19:52:24 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
17/02/16 19:52:24 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
17/02/16 19:52:24 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
17/02/16 19:52:25 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
17/02/16 19:52:35 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
17/02/16 19:52:38 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:38 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:44 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:44 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:46 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
17/02/16 19:52:46 INFO ObjectStore: Initialized ObjectStore
17/02/16 19:52:47 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
17/02/16 19:52:47 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
17/02/16 19:52:48 INFO HiveMetaStore: Added admin role in metastore
17/02/16 19:52:48 INFO HiveMetaStore: Added public role in metastore
17/02/16 19:52:48 INFO HiveMetaStore: No user is added in admin role, since config is empty
17/02/16 19:52:48 INFO HiveMetaStore: 0: get_all_databases
17/02/16 19:52:48 INFO audit: ugi=aims  ip=unknown-ip-addr  cmd=get_all_databases   
17/02/16 19:52:48 INFO HiveMetaStore: 0: get_functions: db=default pat=*
17/02/16 19:52:48 INFO audit: ugi=aims  ip=unknown-ip-addr  cmd=get_functions: db=default pat=* 
17/02/16 19:52:48 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:50 INFO SessionState: Created local directory: /tmp/1fef0bb5-3d3a-47f4-b7ee-594e7a9976f5_resources
17/02/16 19:52:50 INFO SessionState: Created HDFS directory: /tmp/hive/aims/1fef0bb5-3d3a-47f4-b7ee-594e7a9976f5
17/02/16 19:52:50 INFO SessionState: Created local directory: /tmp/aims/1fef0bb5-3d3a-47f4-b7ee-594e7a9976f5
17/02/16 19:52:50 INFO SessionState: Created HDFS directory: /tmp/hive/aims/1fef0bb5-3d3a-47f4-b7ee-594e7a9976f5/_tmp_space.db
17/02/16 19:52:50 INFO HiveContext: default warehouse location is /user/hive/warehouse
17/02/16 19:52:50 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
17/02/16 19:52:50 INFO ClientWrapper: Inspected Hadoop version: 2.6.0
17/02/16 19:52:50 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
17/02/16 19:52:51 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/02/16 19:52:51 INFO ObjectStore: ObjectStore, initialize called
17/02/16 19:52:51 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
17/02/16 19:52:51 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
17/02/16 19:52:51 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
17/02/16 19:52:51 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
17/02/16 19:52:52 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
17/02/16 19:52:53 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:53 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:54 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:54 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:54 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
17/02/16 19:52:54 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
17/02/16 19:52:54 INFO ObjectStore: Initialized ObjectStore
17/02/16 19:52:54 INFO HiveMetaStore: Added admin role in metastore
17/02/16 19:52:54 INFO HiveMetaStore: Added public role in metastore
17/02/16 19:52:54 INFO HiveMetaStore: No user is added in admin role, since config is empty
17/02/16 19:52:54 INFO HiveMetaStore: 0: get_all_databases
17/02/16 19:52:54 INFO audit: ugi=aims  ip=unknown-ip-addr  cmd=get_all_databases   
17/02/16 19:52:54 INFO HiveMetaStore: 0: get_functions: db=default pat=*
17/02/16 19:52:54 INFO audit: ugi=aims  ip=unknown-ip-addr  cmd=get_functions: db=default pat=* 
17/02/16 19:52:54 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
17/02/16 19:52:54 INFO SessionState: Created local directory: /tmp/aa614604-a04e-4663-8128-2777bed53de8_resources
17/02/16 19:52:54 INFO SessionState: Created HDFS directory: /tmp/hive/aims/aa614604-a04e-4663-8128-2777bed53de8
17/02/16 19:52:54 INFO SessionState: Created local directory: /tmp/aims/aa614604-a04e-4663-8128-2777bed53de8
17/02/16 19:52:54 INFO SessionState: Created HDFS directory: /tmp/hive/aims/aa614604-a04e-4663-8128-2777bed53de8/_tmp_space.db
17/02/16 19:52:55 INFO ParseDriver: Parsing command: select count(*) from health
17/02/16 19:52:56 INFO ParseDriver: Parse Completed
17/02/16 19:52:56 INFO HiveMetaStore: 0: get_table : db=default tbl=health
17/02/16 19:52:56 INFO audit: ugi=aims  ip=unknown-ip-addr  cmd=get_table : db=default tbl=health   
Exception in thread "main" org.apache.spark.sql.AnalysisException: Table not found: health; line 1 pos 21
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:306)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$9.applyOrElse(Analyzer.scala:315)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$9.applyOrElse(Analyzer.scala:310)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:56)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:54)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:54)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:265)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
    at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
    at scala.collection.AbstractIterator.to(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
    at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
    at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:305)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:54)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:310)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:300)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:83)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:80)
    at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
    at scala.collection.immutable.List.foldLeft(List.scala:84)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:80)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:72)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:72)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:36)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:36)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
    at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
    at sparkhive.hive.queryhive.main(queryhive.java:31)
17/02/16 19:52:56 INFO SparkContext: Invoking stop() from shutdown hook
17/02/16 19:52:56 INFO SparkUI: Stopped Spark web UI at http://10.0.0.3:4040
17/02/16 19:52:56 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/02/16 19:52:56 INFO MemoryStore: MemoryStore cleared
17/02/16 19:52:56 INFO BlockManager: BlockManager stopped
17/02/16 19:52:56 INFO BlockManagerMaster: BlockManagerMaster stopped
17/02/16 19:52:56 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/02/16 19:52:56 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/02/16 19:52:56 INFO SparkContext: Successfully stopped SparkContext
17/02/16 19:52:56 INFO ShutdownHookManager: Shutdown hook called
17/02/16 19:52:56 INFO ShutdownHookManager: Deleting directory /tmp/spark-4bc83668-fb16-4b27-87bc-219b221e178f
17/02/16 19:52:56 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
17/02/16 19:52:56 INFO ShutdownHookManager: Deleting directory /tmp/spark-94f489b4-608c-4f6b-98cc-e482f8d72855

I have copied hive-site.xml, core-site.xml, hdfs-site.xml and hive-default.xml into the Spark conf folder, but the problem persists.
I wrote this Java code to access a Hive table, yet the exception says the table cannot be found. What do I need to do to get my program to run?

I am using Eclipse Neon and Spark 2.1.0.

1 Answer:

Answer 0: (score: 2)

If you are using Spark 2.1.0, you should use SparkSession with enableHiveSupport(). See Hive Tables in the Spark documentation.

import org.apache.spark.sql.SparkSession;

// spark.sql.warehouse.dir must point at the Hive warehouse; the log above
// reports the default location /user/hive/warehouse.
String warehouseLocation = "/user/hive/warehouse";

SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark Hive Example")
    .master("local[*]")
    .config("spark.sql.warehouse.dir", warehouseLocation)
    .enableHiveSupport()
    .getOrCreate();

spark.sql("select count(*) from health").show();
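Calling enableHiveSupport() is what makes the SparkSession use the Hive metastore as its catalog; without it, Spark 2.x falls back to its own in-memory catalog and cannot see tables that were created in Hive.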

How are you running this application? If you run the class from Eclipse, put the configuration file (hive-site.xml) into the resources folder of your Maven project. See "How is hive-site.xml loaded?" from the Spark developer list.
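If Spark still cannot find the table, it usually means hive-site.xml is not on the classpath and Spark has created its own empty embedded Derby metastore (the log above shows "underlying DB is DERBY"). As a workaround you can point the session at the Hive metastore service directly from code. The following is a minimal sketch under that assumption; the thrift://localhost:9083 URI is only an example and must match the hive.metastore.uris value in your own hive-site.xml.

import org.apache.spark.sql.SparkSession;

// Sketch: point Spark SQL at an existing Hive metastore service explicitly.
// The thrift URI is an assumption -- replace it with the hive.metastore.uris
// value from your own hive-site.xml.
SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark Hive Example")
    .master("local[*]")
    .config("hive.metastore.uris", "thrift://localhost:9083")
    .enableHiveSupport()
    .getOrCreate();

// Sanity check: the health table should be listed here before querying it.
spark.sql("show tables").show();
spark.sql("select count(*) from health").show();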