How to test a Mapper that emits a null key, i.e. context.write(null, <somevalue>);

Time: 2014-11-07 10:07:42

Tags: mapreduce mrunit

I have a MapReduce program that has only a mapper, with no reducer set. I want to test it. I have the following test code:

@Test
    public void testMapper() throws IOException {

      mapDriver.withInput(new LongWritable(0L), new Text(
              "af00bac654249b9d27982f19064338f4,54.0258822077885,-1.56832133466378,20121022,105507,026542913532,2093,87"));
      mapDriver.withOutput(null, [some value]);
      mapDriver.runTest();
    }

The line mapDriver.withOutput(null, [some value]); throws the following exception:

java.lang.NullPointerException
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:58)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)

Note: the mapper's generic signature is Mapper<LongWritable, Text, Void, GenericRecord>.

Can someone tell me how to write a test case for this mapper?

If I use NullWritable.get() as the key instead, I get the following exception:

java.lang.NullPointerException
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:73)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)
    at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
    at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:612)
    at org.apache.hadoop.mrunit.TestDriver.addOutput(TestDriver.java:118)
    at org.apache.hadoop.mrunit.TestDriver.withOutput(TestDriver.java:138)
    at com.gfk.gxl.etl.common.ExtractCSVTest.testMapper(ExtractCSVTest.java:73)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

This looks similar to "MRUnit with Avro NullPointerException in Serialization", but the answers there did not solve my problem.

After some more research, I have the update below:

org.apache.hadoop.mrunit.internal.io.Serialization cannot obtain a serializer or deserializer for class org.apache.avro.generic.GenericData$Record. Both come back as null, which causes the NullPointerException.



Code snippet from org.apache.hadoop.mrunit.internal.io.Serialization, lines 61 to 70:

  try {
      serializer = (Serializer<Object>) serializationFactory
          .getSerializer(clazz);
      deserializer = (Deserializer<Object>) serializationFactory
          .getDeserializer(clazz);
    } catch (NullPointerException e) {
      throw new IllegalStateException(
          "No applicable class implementing Serialization in conf at io.serializations for "
              + orig.getClass(), e);
    }

In the code above, both serializer and deserializer come back as null. Is there a way to avoid this?
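
A null serializer of this kind is what one would expect when no registered serialization accepts the class being copied. The sketch below is a simplified model of such a lookup in plain Java. It is not Hadoop's or MRUnit's actual code, and every name in it (SerializationLookupSketch, Framework, forBaseType, find, require) is hypothetical. It only illustrates why a lookup can quietly return null, and how an explicit check turns that into a descriptive error instead of a later NullPointerException.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of a serialization lookup.
// These are NOT Hadoop's or MRUnit's real classes.
final class SerializationLookupSketch {

    /** Stand-in for one serialization framework registered in the configuration. */
    interface Framework {
        boolean accept(Class<?> c);
        String name();
    }

    /** Builds a framework that accepts only classes assignable to the given base type. */
    static Framework forBaseType(final Class<?> base, final String name) {
        return new Framework() {
            @Override public boolean accept(Class<?> c) { return base.isAssignableFrom(c); }
            @Override public String name() { return name; }
        };
    }

    private final List<Framework> registered = new ArrayList<>();

    void register(Framework f) {
        registered.add(f);
    }

    /** Returns the first framework that accepts the class, or null if none does. */
    Framework find(Class<?> c) {
        for (Framework f : registered) {
            if (f.accept(c)) {
                return f;
            }
        }
        return null;
    }

    /** Like find, but fails fast with a descriptive message when nothing matches. */
    Framework require(Class<?> c) {
        Framework f = find(c);
        if (f == null) {
            throw new IllegalStateException(
                "No registered serialization accepts " + c.getName());
        }
        return f;
    }
}
```

In this model, registering a framework for Number and then looking up String returns null from find, which mirrors the situation where the value class is not covered by any configured serialization.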

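2 answers:

Answer 0 (score: 2)

Use the NullWritable.get() method. Hope this helps.

Answer 1 (score: 0)

Unfortunately, although Hadoop accepts null keys, you cannot currently use null keys in MRUnit. The MRUnit team plans to support them in the future; see the issue "allow null keys and values as output, expected output".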
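
One possible workaround, which is not part of MRUnit, is to keep the mapping logic in a plain method and test it against a small recording sink that tolerates null keys. The sketch below assumes this approach. All names (NullKeyMapperSketch, Sink, Pair, RecordingSink, mapLine) are hypothetical, and the value is a plain String[] rather than the GenericRecord used in the question, so that the example has no Hadoop or Avro dependency.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names belong to Hadoop or MRUnit.
final class NullKeyMapperSketch {

    /** Minimal stand-in for the "write a key/value pair" part of a mapper context. */
    interface Sink<K, V> {
        void write(K key, V value);
    }

    /** One emitted key/value pair, kept for later assertions. */
    static final class Pair<K, V> {
        final K key;
        final V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Sink that records every pair it receives, including pairs with a null key. */
    static final class RecordingSink<K, V> implements Sink<K, V> {
        final List<Pair<K, V>> pairs = new ArrayList<>();

        @Override
        public void write(K key, V value) {
            pairs.add(new Pair<>(key, value));
        }
    }

    /** Mapping logic under test: splits a CSV line and emits the fields with a null key. */
    static void mapLine(String line, Sink<Void, String[]> sink) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
        sink.write(null, fields);
    }

    public static void main(String[] args) {
        RecordingSink<Void, String[]> sink = new RecordingSink<>();
        mapLine("af00bac654249b9d27982f19064338f4,54.0258822077885,-1.56832133466378,"
                + "20121022,105507,026542913532,2093,87", sink);
        Pair<Void, String[]> first = sink.pairs.get(0);
        System.out.println("key=" + first.key + ", fields=" + first.value.length);
    }
}
```

The idea is that the Hadoop Mapper stays a thin wrapper that forwards each pair to its Context, while the parsing logic and the null-key behaviour are verified here without going through MRUnit's copy step.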