AWS ElasticSearch Service returns 403 Forbidden on document insert

Time: 2017-06-01 06:26:14

Tags: java amazon-web-services hadoop elasticsearch hive

I am trying to upload data to AWS ElasticSearch using a Hive query. The intermediate table used in the Hive query has the following structure.

CREATE TABLE uid_taxonomy_1_day_es(
UID String
,VISITS INT
,TAXONOMY_LABEL_1 INT
,TAXONOMY_LABEL_2 INT
,TAXONOMY_LABEL_3 INT
,DAY_OF_YEAR INT)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
'es.nodes' = 'https://search-cust-360-view-6cj2j4mcytbtrpodzffnh3uadi.us-east-1.es.amazonaws.com',
'es.port' = '443',
'es.index.auto.create' = 'false',
'es.batch.size.entries' = '1000',
'es.batch.write.retry.count' = '10000',
'es.batch.write.retry.wait' = '10s',
'es.batch.write.refresh' = 'false',
'es.nodes.discovery' = 'false',
'es.nodes.client.only' = 'false',
'es.resource' = 'urltaxonomy/uids',
'es.query' = '?q=*',
'es.nodes.wan.only' = 'true');

The actual load query takes about half an hour to load the data. Partway through, at seemingly random records, it throws a 403 Forbidden error.

Here is the stack trace.

Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: [HEAD] on [urltaxonomy] failed; server[https://search-cust-360-view-6cj2j4mcytbtrpodzffnh3uadi.us-east-1.es.amazonaws.com:443] returned [403|Forbidden:]
at org.elasticsearch.hadoop.rest.RestClient.checkResponse(RestClient.java:505)
at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:476)
at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:537)
at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:543)
at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:412)
at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:606)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:594)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173)

The error occurs only intermittently: if I retry inserting the records that failed, the insert succeeds.
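Because a manual retry succeeds, one way to check whether the 403s are transient is to retry the failing request with a backoff. In the stack trace above the failure is on the HEAD request against the index, made while the writer is initialized, so it is an open question whether the `es.batch.write.retry.*` settings cover it. The following is only a minimal sketch of the retry idea, not how es-hadoop behaves: `retry_on_forbidden` and `send` are hypothetical names, and the endpoint is simulated with a fixed list of status codes instead of a real cluster.

```python
import time


def retry_on_forbidden(send, max_attempts=5, base_wait=1.0, sleep=time.sleep):
    """Call send() until it returns something other than 403, or attempts run out.

    send is a hypothetical zero-argument callable that performs one request
    and returns the HTTP status code. Returns (last_status, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status != 403:
            return status, attempt + 1
        if attempt < max_attempts - 1:
            # Exponential backoff: base_wait, 2*base_wait, 4*base_wait, ...
            sleep(base_wait * (2 ** attempt))
    return status, max_attempts


# Simulated endpoint (assumption): rejects the first two calls, then accepts.
responses = iter([403, 403, 200])
waits = []  # records the requested waits instead of actually sleeping
result = retry_on_forbidden(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

If a request that failed with 403 reliably succeeds after a short wait, that would point toward something transient on the service side and away from the table definition. A 403 that never clears would point toward access policy or request signing.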

I suspect my intermediate table structure may be the cause.

0 answers:

No answers