My Python UDF returns a list of tuples, like this:
[(0.01, 12), (0.02, 6), (0.03, 12), (0.04, 19), (0.05, 29), (0.06, 42)]
The list above was printed to the mapper's stdout and copied from there.
The two values in each tuple are cast to float and int respectively. I also printed the types, and the casts are indeed being applied correctly:
(<type 'float'>, <type 'int'>)
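The check that produced that output is essentially the following (a sketch with placeholder values; Python 2 / Jython syntax, matching the printed types above):

    # cast one pair the same way the UDF does, then print the types
    pct, count = float("0.01"), int("12")
    print (type(pct), type(count))  # prints (<type 'float'>, <type 'int'>)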
Here is the decorator:

@outputSchema("stats:bag{improvement:tuple(percent:float,entityCount:int)}")
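For context, the whole UDF is shaped roughly like this (a minimal sketch: the import, function name, argument, and body are placeholders, not my actual code; only the decorator string is real):

    from pig_util import outputSchema  # for CPython streaming UDFs; with Jython, Pig injects outputSchema itself

    @outputSchema("stats:bag{improvement:tuple(percent:float,entityCount:int)}")
    def improvement_stats(rows):
        # build the bag as a list of (float, int) tuples
        out = []
        for pct, count in rows:
            out.append((float(pct), int(count)))
        return out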
And here is the error message:
Error: java.io.IOException: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.RuntimeException: Datum (0.01,12) is not in union ["null",{"type":"record","name":"TUPLE_1","fields":[{"name":"percent","type":["null","float"],"doc":"autogenerated from Pig Field Schema","default":null},{"name":"entityCount","type":["null","int"],"doc":"autogenerated from Pig Field Schema","default":null}]}]
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:479)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:442)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:422)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:269)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.RuntimeException: Datum (0.01,12) is not in union ["null",{"type":"record","name":"TUPLE_1","fields":[{"name":"percent","type":["null","float"],"doc":"autogenerated from Pig Field Schema","default":null},{"name":"entityCount","type":["null","int"],"doc":"autogenerated from Pig Field Schema","default":null}]}]
    at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
    at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.write(PigAvroRecordWriter.java:49)
    at org.apache.pig.piggybank.storage.avro.AvroStorage.putNext(AvroStorage.java:646)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:558)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:477)
    ... 11 more
Caused by: java.lang.RuntimeException: Datum (0.01,12) is not in union ["null",{"type":"record","name":"TUPLE_1","fields":[{"name":"percent","type":["null","float"],"doc":"autogenerated from Pig Field Schema","default":null},{"name":"entityCount","type":["null","int"],"doc":"autogenerated from Pig Field Schema","default":null}]}]
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.resolveUnion(PigAvroDatumWriter.java:132)
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.writeUnion(PigAvroDatumWriter.java:111)
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:82)
    at org.apache.avro.generic.GenericDatumWriter.writeArray(GenericDatumWriter.java:131)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:68)
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:99)
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.writeUnion(PigAvroDatumWriter.java:113)
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:82)
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.writeRecord(PigAvroDatumWriter.java:378)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
    at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:99)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
    at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
    ...
Does anyone know what I am doing wrong in the schema?