Creating a Hive external table on S3 throws "Class org.apache.hadoop.fs.s3a.S3AFileSystem not found"

Time: 2017-12-29 01:59:06

Tags: hadoop amazon-s3 hive

I ran the DDL below with beeline on my local machine, and it threw an exception.

DDL

CREATE TABLE `report_landing_pages`(
  `google_account_id` string COMMENT 'from deserializer',
  `ga_view_id` string COMMENT 'from deserializer',
  `path` string COMMENT 'from deserializer',
  `users` string COMMENT 'from deserializer',
  `page_views` string COMMENT 'from deserializer',
  `event_value` string COMMENT 'from deserializer',
  `report_date` string COMMENT 'from deserializer')
PARTITIONED BY (`dt` date)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION 's3a://bucket_name/table'

JarFile

The exception is

org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found)
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
    at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:862)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:867)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4356)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255)
    ... 11 more
Caused by: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:42070)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:42038)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:41964)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1199)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1185)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2399)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:93)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:752)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:740)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
    at com.sun.proxy.$Proxy34.createTable(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2330)
    at com.sun.proxy.$Proxy34.createTable(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:852)
    ... 22 more

My local HDFS client works fine with "hdfs dfs -mkdir s3a://bucket/table".

Strangely, if I first create the table somewhere other than S3 and later manually update the table's location to S3 in the metastore, select statements work fine.

How can I fix the exception in the DDL?

By the way, I am using Hive 2.3.2 with Hadoop 2.7.5 on Mac OS X El Capitan.

4 Answers:

Answer 0: (score: 1)

Problem solved.

After placing the S3 jars, the metastore service must be restarted in addition to hiveserver2.

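Answer 1: (score: 0)

I tried creating the same table in my environment, and it worked.

Check the following properties in your configuration files: "fs.s3a.access.key", "fs.s3a.secret.key", and "fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem".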
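As a quick way to run that check, here is a minimal sketch that reports which of the three property names are absent from a Hadoop-style XML configuration file. The function name `missing_s3a_props` and the example path are assumptions for illustration, not part of Hive or Hadoop, and the check only looks for the `<name>` element, not for a valid value.

```shell
# Print each expected S3A property name that does not appear in the given
# Hadoop/Hive XML config file. Prints nothing if all three are present.
# (missing_s3a_props is a hypothetical helper, not a Hadoop command.)
missing_s3a_props() {
  local conf_file="$1"
  local prop
  for prop in fs.s3a.access.key fs.s3a.secret.key fs.s3a.impl; do
    # -F: fixed-string match, -q: quiet; look for the exact <name> element.
    if ! grep -Fq "<name>${prop}</name>" "$conf_file"; then
      echo "$prop"
    fi
  done
}

# Hypothetical usage; the path is an assumption, adjust it to your install:
# missing_s3a_props "$HADOOP_HOME/etc/hadoop/core-site.xml"
```

An empty result only means the names are present in that one file. It does not prove that the metastore process actually loads the file.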

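Answer 2: (score: 0)

Have you put all the required jars (S3 and so on) on the classpath? org.apache.hadoop.fs.s3a.S3AFileSystem is a Hadoop class that lives in the hadoop-aws jar. An exception reporting that this class cannot be found means the jar is not on the classpath of the process that raised it.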
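To see whether a hadoop-aws jar is present at all, a small sketch like the following can scan candidate directories. The function name `find_hadoop_aws_jar` is made up for this example, and the directories in the usage line are assumptions about a typical layout rather than something stated in the question.

```shell
# Print every hadoop-aws jar found directly inside the given directories.
# Returns non-zero if none is found.
# (find_hadoop_aws_jar is a hypothetical helper, not a Hadoop command.)
find_hadoop_aws_jar() {
  local dir jar found=1
  for dir in "$@"; do
    for jar in "$dir"/hadoop-aws-*.jar; do
      # An unmatched glob stays literal, so confirm the file really exists.
      if [ -f "$jar" ]; then
        echo "$jar"
        found=0
      fi
    done
  done
  return "$found"
}

# Hypothetical usage; both directories are assumptions about your install:
# find_hadoop_aws_jar "$HIVE_HOME/lib" "$HADOOP_HOME/share/hadoop/tools/lib"
```

Finding the jar on disk is necessary but not sufficient: a service that was already running when the jar was added will not see it until it is restarted.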

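Answer 3: (score: 0)

Yes, that is correct. After placing the jars, you need to restart the Hive metastore service.

To restart the Hive metastore, follow these steps:

1. ps -ef | grep 'hive'

Use the command above to identify the process ID (PID) of the Hive process. By default the Hive metastore listens on port 9083, so for a running metastore service you can also find the PID with the lsof -i:9083 command.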

2. kill <process number>

This kills the existing Hive metastore service.

3. hive --service metastore

This command starts the metastore service again.

OR, if you are using hive-server 2, use the following commands:

$ sudo /etc/init.d/hive-metastore stop
$ sudo /etc/init.d/hive-metastore start
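Steps 1 to 3 of the manual restart can be sketched roughly as follows. The helper `metastore_pids` is hypothetical. It assumes the metastore JVM shows the main class org.apache.hadoop.hive.metastore.HiveMetaStore in its command line and that `ps -ef` prints the PID in the second column, so verify both on your system before killing anything.

```shell
# Read `ps -ef` output on stdin and print the PID (second column) of every
# line whose command mentions the metastore main class.
# (metastore_pids is a hypothetical helper; the class name is an assumption
# about how `hive --service metastore` appears in the process list.)
metastore_pids() {
  awk '/org\.apache\.hadoop\.hive\.metastore\.HiveMetaStore/ { print $2 }'
}

# Hypothetical usage, mirroring steps 1 to 3:
# for pid in $(ps -ef | metastore_pids); do kill "$pid"; done
# hive --service metastore &
```

Matching on the class name rather than on 'hive' avoids picking up hiveserver2 or the grep process itself. If `ps` truncates long command lines on your platform, the class name may be cut off and the lsof -i:9083 approach is the safer choice.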