HiveException (ClassCastException) when aggregating with "group by" in WSO2 BAM

Time: 2013-09-20 08:19:52

Tags: wso2 hive

...

With WSO2 BAM (v2.3.0), the following query fails to execute:

insert overwrite table tab_summarized_parse_sessions
       select parseId, count(*) from tab_session_info group by parseId;

Log output:

TID: [0] [BAM] [2013-09-20 12:38:49,783] FATAL {ExecMapper} -  org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"mid":"1379599795585::192.168.2.17::9443::1","sessionid":null,"parseid":40}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:211)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.CassandraLazyLong cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyLong

...

TID: [0] [BAM] [2013-09-20 12:38:50,606] ERROR {org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl} -  Error while executing Hive script.
Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask {org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl}
java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
    at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:189)
    at org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl$ScriptCallable.call(HiveExecutorServiceImpl.java:355)
    at org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl$ScriptCallable.call(HiveExecutorServiceImpl.java:250)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Script execution result:

ERROR: Error while executing Hive script.Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

The problem seems to be in the "group by" part of the query. For reference, here are the definitions of the Hive tables involved:

CREATE EXTERNAL TABLE IF NOT EXISTS tab_session_info 
(
    mid STRING, sessionId STRING, parseId BIGINT
)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES ( 
"cassandra.host" = "127.0.0.1" ,
"cassandra.port" = "9160" ,
"cassandra.ks.name" = "EVENT_KS" ,
"cassandra.ks.username" = "admin" ,
"cassandra.ks.password" = "admin" ,
"cassandra.cf.name" = "session_main" ,
"cassandra.columns.mapping" = ":key,payload_sessionId, payload_parseId" );


CREATE EXTERNAL TABLE IF NOT EXISTS tab_summarized_parse_sessions (parseId BIGINT, sessionCount INT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES ( 
'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver' , 
'mapred.jdbc.url' = 'jdbc:mysql://localhost/MYSQL_DB_NAME' , 
'mapred.jdbc.username' = 'MYSQL_USER' , 
'mapred.jdbc.password' = 'MYSQL_PASS' , 
'hive.jdbc.update.on.duplicate' = 'true' ,
'hive.jdbc.primary.key.fields' = 'parseId' ,
'hive.jdbc.table.create.query' = 'CREATE TABLE IF NOT EXISTS summarized_parse_sessions
                                (parseId BIGINT NOT NULL PRIMARY KEY, sessionCount  INT )');

Thanks in advance :)

0 answers:

No answers