通过Kylin构建多维数据集创建配置单元表时出错

时间:2018-11-19 15:16:38

标签: hadoop hive kylin

嗨,我正在尝试使用Kylin构建一个多维数据集,从sqoop可以很好地获取数据,但是创建配置单元表的下一步失败。看着被触发的命令,它看起来很奇怪,因为create语句对我来说很好。

我认为问题出在DOUBLE类型上,因为当我删除相同的create语句时,它工作正常。有人可以帮忙吗。

我在AWS EMR中使用堆栈,kylin 2.5配置单元2.3.0

使用以下命令记录错误

命令

    hive -e "USE default;
DROP TABLE IF EXISTS kylin_intermediate_fm_inv_holdings_8a1c33df_d12b_3609_13ee_39e169169368;
CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_fm_inv_holdings_8a1c33df_d12b_3609_13ee_39e169169368
(
HOLDINGS_STOCK_INVESTOR_ID string
,STOCK_INVESTORS_CHANNEL string
,STOCK_STOCK_ID string
,STOCK_DOMICILE string
,STOCK_STOCK_NM string
,STOCK_APPROACH string
,STOCK_STOCK_TYP string
,INVESTOR_ID string
,INVESTOR_NM string
,INVESTOR_DOMICILE_CNTRY string
,CLIENT_NM string
,INVESTOR_HOLDINGS_GROSS_ASSETS_USD double(22)
,INVESTOR_HOLDINGS_NET_ASSETS_USD double(22)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 's3://wfg1tst-models/kylin/kylin_metadata/kylin-4ae3b18b-831b-da66-eb8c-7318245c4448/kylin_intermediate_fm_inv_holdings_8a1c33df_d12b_3609_13ee_39e169169368';
ALTER TABLE kylin_intermediate_fm_inv_holdings_8a1c33df_d12b_3609_13ee_39e169169368 SET TBLPROPERTIES('auto.purge'='true');

" --hiveconf hive.merge.mapredfiles=false --hiveconf hive.auto.convert.join=true --hiveconf dfs.replication=2 --hiveconf hive.exec.compress.output=true --hiveconf hive.auto.convert.join.noconditionaltask=true --hiveconf mapreduce.job.split.metainfo.maxsize=-1 --hiveconf hive.merge.mapfiles=false --hiveconf hive.auto.convert.join.noconditionaltask.size=100000000 --hiveconf hive.stats.autogather=true

错误如下

OK
Time taken: 1.315 seconds
OK
Time taken: 0.09 seconds
MismatchedTokenException(334!=347)
    at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
    at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
    at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:6179)
    at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:3808)
    at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2382)
    at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1333)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
    at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)
    at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1316)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1456)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1236)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1226)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 15:42 mismatched input '(' expecting ) near 'double' in create table statement

2 个答案:

答案 0 :(得分:1)

哦,我想我终于想通了,似乎Hive不支持DOUBLE。但是我认为Kylin在将jdbc元数据导入模型中时应该注意这一点。

将在Kylin上进行相同的增强或错误处理。

答案 1 :(得分:0)

我在测试环境中安装了最新的稳定配置单元,版本是2.3.4(配置单元DDL文档说2.2.0支持双精度)。我发现蜂巢不支持特定/用户定义精度的double。

例如

  • [错误]创建表haha(ha double 10);
  • [错误]创建表haha(ha double Precision 17);
  • [错误]创建表haha(ha double(22));
  • [CORRECT]创建表haha(具有双精度);

Hive CMD:

hive> use ss;
OK
Time taken: 0.015 seconds
hive> create table haha(ha double 10);
FAILED: ParseException line 1:28 extraneous input '10' expecting ) near '<EOF>'
hive> create table haha(ha double Precision 17);
FAILED: ParseException line 1:38 extraneous input '17' expecting ) near '<EOF>'
hive> create table haha(ha double(22));
MismatchedTokenException(334!=347)
at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:6179)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:3808)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2382)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1333)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:27 mismatched input '(' expecting ) near 'double' in create table statement
hive> create table haha(ha double Precision);
OK
Time taken: 0.625 seconds

因此,据我所知,Hive不支持此类数据类型定义。 kylin开发人员无法解决该问题。

如果发现任何错误,请告诉我。我发现以下链接可能会有所帮助:

  1. https://cwiki.apache.org/confluence/display/hive/LanguageManual+DDL#LanguageManualDDL-CreateTableCreate/Drop/TruncateTable

  2. https://issues.apache.org/jira/browse/HIVE-13556