What is 152³305746?

Time: 2018-04-13 19:48:41

Tags: hadoop hive

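I am getting 152³305746 as bad data in my input file. I tried to filter it out, but I have not managed to tell Hive how to detect and skip it. I am not even sure what data type it is. I expect bigint values in this column, not what I am seeing here.

I have tried the following, in various combinations, and none of them skips the bad records in my input: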

1) CAST(mycol AS string) RLIKE "^[0-9]+$"

2) mycol < 2147483647

3) CREATE TEMPORARY MACRO isNumber(s string) CAST(s as BIGINT) IS NOT NULL;

4) isNumber(mycol) != false

5) SET mapreduce.map.skip.maxrecords = 100000000;
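
For reference, the pattern from attempt 1 can be checked outside Hive. The sketch below is only an illustration: the class and method names are made up, and it assumes the check sees the value as raw text, which may not be what happens inside Hive when the column is declared as bigint. Under that assumption the pattern does reject the bad value, so the regular expression itself does not look like the problem.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive or the JSON SerDe.
final class DigitsOnlyCheck {
    // Same pattern as attempt 1; [0-9] matches ASCII digits only.
    private static final Pattern DIGITS_ONLY = Pattern.compile("^[0-9]+$");

    // Returns true only when the raw text consists entirely of ASCII digits.
    static boolean isAsciiDigits(String raw) {
        return raw != null && DIGITS_ONLY.matcher(raw).find();
    }

    public static void main(String[] args) {
        // The value from the error message; \u00B3 is the superscript three.
        String bad = "152\u00B3305746";
        System.out.println(isAsciiDigits(bad));         // false
        System.out.println(isAsciiDigits("152305746")); // true
    }
}
```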

None of the above worked. Hive fails with the following error:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: For input string: "152³305746"
        at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:416)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
        at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
        at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
        ... 9 more
Caused by: java.lang.NumberFormatException: For input string: "152³305746"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:589)
        at java.lang.Long.parseLong(Long.java:631)
        at org.openx.data.jsonserde.objectinspector.primitive.ParsePrimitiveUtils.parseLong(ParsePrimitiveUtils.java:49)
        at org.openx.data.jsonserde.objectinspector.primitive.JavaStringLongObjectInspector.get(JavaStringLongObjectInspector.java:46)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:400)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:279)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:239)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:201)
        at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:565)
        at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:395)
        ... 17 more
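
The innermost frames show java.lang.Long.parseLong rejecting the string. A minimal standalone sketch of that failure, with made-up class and method names, is below. It lists the code points of the value and shows that the fourth character is U+00B3 (SUPERSCRIPT THREE), which Java does not treat as a decimal digit. How the character got into the input file is not something this sketch can tell.

```java
// Hypothetical helper for inspecting the value outside Hive.
final class InspectBadValue {
    // Lists every code point of the string in "U+XXXX" form, space separated.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Mirrors the call at the bottom of the stack trace.
    static boolean parsesAsLong(String s) {
        try {
            Long.parseLong(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The value from the error message; \u00B3 is the superscript three.
        String bad = "152\u00B3305746";
        System.out.println(codePoints(bad));
        System.out.println(Character.isDigit(0x00B3)); // false
        System.out.println(parsesAsLong(bad));         // false
    }
}
```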

0 answers