I am trying to test my Spark Scala code from IntelliJ; the code needs to create a Hive table. I have installed Hive locally on my Mac, with MySQL as the metastore backend. From spark-shell I can create a Hive table with:

sqlContext.sql("CREATE TABLE IF NOT EXISTS employee(id INT, name STRING, age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'")

However, the same command in my Scala program, even though it runs successfully from IntelliJ, does not actually create any table visible in the Hive metastore.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("BiddingExternalTable")
  .master("local")
  .enableHiveSupport()
  .getOrCreate()

spark.sqlContext.sql("CREATE TABLE IF NOT EXISTS employeeExternal(id INT, name STRING, age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'")
Judging from the console output, the Spark session inside IntelliJ is still using the default Derby metastore:
19/05/02 17:40:06 INFO SharedState: Warehouse path is 'file:/Users/sichu/src/MktDataSSS/spark-warehouse/'.
19/05/02 17:40:07 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
19/05/02 17:40:09 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
19/05/02 17:40:09 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
19/05/02 17:40:09 INFO ObjectStore: ObjectStore, initialize called
19/05/02 17:40:10 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
19/05/02 17:40:10 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
19/05/02 17:40:10 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
19/05/02 17:40:11 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
19/05/02 17:40:11 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/02 17:40:11 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
19/05/02 17:40:11 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/02 17:40:11 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
19/05/02 17:40:11 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
This happens even though I have added the JDBC driver (and its folder) to the CLASSPATH, and placed the hive-site.xml file in the Hadoop conf directory. That hive-site.xml is picked up successfully by spark-shell, but not when the Scala program is run from inside IntelliJ.
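For reference, a minimal hive-site.xml pointing the metastore at a local MySQL database typically looks like the sketch below; the database name, user name, and password are placeholders for my local setup:

```xml
<configuration>
  <!-- JDBC connection for the metastore backing database -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepass</value>
  </property>
</configuration>
```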
Can someone help me connect the Spark job in IntelliJ to the MySQL-backed Hive metastore I set up on my local machine? Thanks!
Answer 0 (score: 0)
You should point the session at your metastore explicitly. Note that hive.metastore.uris expects a Thrift URI (the address of a running Hive metastore service), not a JDBC URL; the MySQL JDBC connection string belongs in javax.jdo.option.ConnectionURL, which is normally supplied through hive-site.xml:

val spark = SparkSession
  .builder()
  .master("local")
  .appName("Test Hive Support")
  .config("hive.metastore.uris", "thrift://localhost:9083")
  .enableHiveSupport()
  .getOrCreate()
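If no standalone metastore service is running (spark-shell on a local Hive install usually talks to MySQL directly through the embedded metastore client), an alternative is to pass the same JDO settings that hive-site.xml would provide straight to the builder. This is a sketch under that assumption; the URL, user name, and password are placeholders, and the MySQL JDBC driver must be on the application classpath in IntelliJ:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: connect the embedded Hive metastore client directly to MySQL,
// bypassing the need for Spark to locate hive-site.xml on the classpath.
val spark = SparkSession
  .builder()
  .master("local")
  .appName("Test Hive Support")
  .config("javax.jdo.option.ConnectionURL",
          "jdbc:mysql://localhost:3306/metastore")          // placeholder DB name
  .config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver")
  .config("javax.jdo.option.ConnectionUserName", "hiveuser") // placeholder
  .config("javax.jdo.option.ConnectionPassword", "hivepass") // placeholder
  .enableHiveSupport()
  .getOrCreate()
```

Another common fix for the original symptom is simply to add the resources directory containing hive-site.xml to the IntelliJ module classpath (or copy it into src/main/resources), since Spark only picks the file up when it is visible on the application classpath.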