Hive MERGE problem: cardinality violation

Time: 2018-10-02 08:29:06

Tags: merge hive hortonworks-data-platform

I have the MERGE statement shown below, where a1 is a unique integer. The intent is an upsert: when the source table has a new row, insert it into the target table, and when the row already exists in the target table, update it. Running the statement fails with the cardinality violation error shown after it.

The statement:

    MERGE INTO `s1`.`t1` `t`
    USING (
        SELECT `q`.`a1`,
               `q`.`a2`
        FROM (
            SELECT `r`.`a1`,
                   `r`.`a2`,
                   ROW_NUMBER() OVER (
                       PARTITION BY `r`.`a1`
                       ORDER BY CAST(`r`.`a3` AS STRING) DESC
                   ) `rank`
            FROM `s2`.`t2` `r`
        ) `q`
        WHERE `q`.`rank` = 1
    ) `s`
    ON (`t`.`a1` = `s`.`a1`)
    WHEN MATCHED THEN UPDATE
        SET ...
    WHEN NOT MATCHED THEN INSERT VALUES ..

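I tested the SELECT used as the source on its own, and it returns no duplicates. Adding a LIMIT to the source of the merge did make the problem go away, even though the source table has fewer than 10000 records. Any idea why, or how to find the real problem?

The error raised at runtime: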

Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":{"transactionid":A,"bucketid":B,"rowid":C}},"value":{"_col0":2}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:266)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":{"transactionid":A,"bucketid":B,"rowid":C}},"value":{"_col0":2}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
    ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cardinality_violation(_col0)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:86)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:122)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1022)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:827)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:701)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:767)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
    ... 17 more
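
One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: the cardinality_violation in the trace comes from the check Hive applies to MERGE, which fails when more than one source row matches the same target row. The row key in the trace (transactionid, bucketid, rowid) looks like the ROW__ID virtual column of a transactional table, and the value 2 looks like a match count. The query below mirrors that check by joining the target to the same source subquery and counting matches per target row. The table alias `r` and the column aliases `rn`, `target_row` and `matches` are assumptions added for illustration; they are not part of the original statement.

```sql
-- Diagnostic sketch: list target rows matched by more than one source row.
-- Assumes `s1`.`t1` is a transactional (ACID) table, so ROW__ID is available.
SELECT `t`.ROW__ID AS `target_row`,   -- illustrative alias
       COUNT(*)    AS `matches`       -- illustrative alias
FROM `s1`.`t1` `t`
JOIN (
    SELECT `q`.`a1`,
           `q`.`a2`
    FROM (
        SELECT `r`.`a1`,
               `r`.`a2`,
               ROW_NUMBER() OVER (
                   PARTITION BY `r`.`a1`
                   ORDER BY CAST(`r`.`a3` AS STRING) DESC
               ) AS `rn`              -- stands in for the original `rank` alias
        FROM `s2`.`t2` `r`            -- `r` is an assumed alias
    ) `q`
    WHERE `q`.`rn` = 1
) `s`
  ON `t`.`a1` = `s`.`a1`
GROUP BY `t`.ROW__ID
HAVING COUNT(*) > 1;
```

If this returns rows, the listed target rows are the ones that trigger the violation, and the matching a1 values can be inspected in both tables. If it returns nothing while the MERGE still fails, the difference is more likely in how the MERGE itself is planned or executed than in the data.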

0 answers:

No answers.