"Could not get input splits" error when using Hive with the Cassandra CqlStorageHandler

Asked: 2014-02-18 11:58:34

Tags: hadoop cassandra hive hiveql

I am trying to read data from Cassandra using Hive with the CqlStorageHandler.

Versions:

Hive 0.11.0
Hadoop 1.2.1
Cassandra 1.2.6

I am able to create an EXTERNAL table with the following Hive query:

CREATE EXTERNAL TABLE input (number string, name string, address string)
STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
WITH SERDEPROPERTIES (
  "cassandra.columns.mapping" = ":key,name,address",
  "cassandra.ks.name" = "cassandradb",
  "cassandra.host" = "localhost",
  "cassandra.port" = "9160")
TBLPROPERTIES (
  "cassandra.input.split.size" = "64000",
  "cassandra.range.size" = "1000",
  "cassandra.slice.predicate.size" = "1000");

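(The table "input" already exists in Cassandra, was created using CQL3, and contains some data.)

However, when I try to read the data with the following query: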

select * from input where number = "1";

I get the following error:

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: Could not get input splits
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:189)
    at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:213)
    at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:169)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:297)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
    at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:185)
    ... 31 more
Caused by: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:444)
    at java.lang.Long.valueOf(Long.java:540)
    at org.apache.cassandra.dht.Murmur3Partitioner$1.fromString(Murmur3Partitioner.java:188)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:239)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:207)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Job Submission failed with exception 'java.io.IOException(Could not get input splits)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
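
The innermost cause in the trace above is Long.valueOf, called from Murmur3Partitioner, rejecting the token string. The sketch below only reproduces that one step in isolation: it checks whether the token from the trace fits in a signed 64-bit long, and whether it fits in the range 0 to 2^127. The class and method names are made up for illustration, and treating 0 to 2^127 as the RandomPartitioner token range is an assumption of this sketch, not something the trace states. If that assumption holds, the token may have come from a cluster whose partitioner differs from the one the job is parsing with, but that is a guess to verify against the cluster's actual configuration.

```java
import java.math.BigInteger;

public class TokenRangeCheck {
    // Token copied verbatim from the NumberFormatException in the trace.
    public static final String TOKEN = "143514173170822869679056708180186660043";

    /** Returns true if the text parses as a signed 64-bit long. */
    public static boolean fitsInLong(String text) {
        try {
            Long.parseLong(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /**
     * Returns true if 0 <= value <= 2^127.
     * Assumption: this is the token range of RandomPartitioner.
     */
    public static boolean fitsInAssumedRandomPartitionerRange(String text) {
        BigInteger value = new BigInteger(text);
        BigInteger max = BigInteger.ONE.shiftLeft(127);
        return value.signum() >= 0 && value.compareTo(max) <= 0;
    }

    public static void main(String[] args) {
        System.out.println("fits in long: " + fitsInLong(TOKEN));
        System.out.println("fits in 0..2^127: " + fitsInAssumedRandomPartitionerRange(TOKEN));
    }
}
```

Running main prints "fits in long: false" and "fits in 0..2^127: true", which is consistent with the NumberFormatException but does not by itself prove where the token came from.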

Am I missing something? Please advise.

0 answers:

No answers yet.