Inserting collect_set values into Elasticsearch using Pig

Time: 2017-11-27 22:48:11

Tags: hadoop elasticsearch hive apache-pig

I have a Hive table with three columns, "id" (string), "booklist" (array of strings), and "date" (string), containing the following data:

----------------------------------------------------
id | booklist            | date
----------------------------------------------------
1  | ["Book1" , "Book2"] | 2017-11-27T01:00:00.000Z
2  | ["Book3" , "Book4"] | 2017-11-27T01:00:00.000Z

When I try to insert it into Elasticsearch with this Pig script:

-------------------------Script begins------------------------------------------------
-- Point Pig at the Hive metastore and load the elasticsearch-hadoop connector.
SET hive.metastore.uris 'thrift://node:9000';
REGISTER hdfs://node:9001/library/elasticsearch-hadoop-5.0.0.jar;
DEFINE HCatLoader org.apache.hive.hcatalog.pig.HCatLoader();
-- Upsert documents keyed on the "id" field.
DEFINE EsStore org.elasticsearch.hadoop.pig.EsStorage(
  'es.nodes = elasticsearch.service.consul',
  'es.port = 9200',
  'es.write.operation = upsert',
  'es.mapping.id = id',
  'es.mapping.pig.tuple.use.field.names=true'
);

-- Read the Hive table and rename booklist to bookList for the index.
hivetable = LOAD 'default.reading' USING HCatLoader();
hivetable_flat = FOREACH hivetable
                 GENERATE
                   id AS id,
                   booklist AS bookList,
                   date AS date;
STORE hivetable_flat INTO 'readings/reading' USING EsStore();
-------------------------Script ends------------------------------------------------

Running the above, I get this error:

ERROR 2999:Unexpected internal error. Found unrecoverable error [ip:port] returned Bad Request(400) - failed to parse [bookList]; Bailing out..

Can anyone explain how to get a string array from Hive written into ES correctly?

Thanks!
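
One possible cause, offered as an unverified sketch rather than a confirmed fix: HCatLoader maps a Hive array to a Pig bag of tuples, and with 'es.mapping.pig.tuple.use.field.names=true' elasticsearch-hadoop may serialize each tuple in that bag as an object instead of a plain value. If the bookList field in the target index is mapped as a string type, an array of objects could be rejected with a parse error like the one above. The script below is the original script with that one option dropped and a DESCRIBE added to inspect the loaded schema. The alias EsStoreArrays is a hypothetical name, and the assumed behaviour should be checked against the elasticsearch-hadoop 5.0.0 documentation and the actual index mapping.

```pig
-- Sketch only: not verified against the cluster in the question.
-- Assumption: the 400 is caused by bag elements being written as objects
-- when 'es.mapping.pig.tuple.use.field.names=true' is set.
SET hive.metastore.uris 'thrift://node:9000';
REGISTER hdfs://node:9001/library/elasticsearch-hadoop-5.0.0.jar;

-- Same settings as the original script, minus the tuple field-name option.
-- EsStoreArrays is a hypothetical alias chosen for this sketch.
DEFINE EsStoreArrays org.elasticsearch.hadoop.pig.EsStorage(
  'es.nodes = elasticsearch.service.consul',
  'es.port = 9200',
  'es.write.operation = upsert',
  'es.mapping.id = id'
);

hivetable = LOAD 'default.reading' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- Print the Pig schema to see how the Hive array column was typed.
DESCRIBE hivetable;

hivetable_flat = FOREACH hivetable
                 GENERATE
                   id AS id,
                   booklist AS bookList,
                   date AS date;
STORE hivetable_flat INTO 'readings/reading' USING EsStoreArrays();
```

If the DESCRIBE output shows booklist as a bag of single-field tuples, comparing that shape with the existing mapping of bookList in the readings index should indicate whether this assumption applies.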

0 answers:

No answers yet