Hadoop 0.20.2 reducer throws ArrayIndexOutOfBoundsException when iterating over values

Time: 2011-10-01 15:48:10

Tags: hadoop

I'm fairly new to Hadoop, but I've been reading "Hadoop: The Definitive Guide", so I think I have a grasp of the basic concepts.

I'm using Hadoop 0.20.2 to run a fairly simple job, but I get the following exception:

java.lang.ArrayIndexOutOfBoundsException: 4096
        at java.io.ByteArrayInputStream.read(ByteArrayInputStream.java:127)
        at java.io.DataInputStream.readInt(DataInputStream.java:373)
        at com.convertro.mapreduce.WritableHit.readFields(Unknown Source)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
        at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
        at org.apache.hadoop.mapreduce.ReduceContext$ValueIterator.next(ReduceContext.java:163)
        at com.convertro.mapreduce.HitConvertingIterable$HitConvertingIterator.next(HitConvertingIterable.java:35)
        at com.convertro.mapreduce.HitConvertingIterable$HitConvertingIterator.next(HitConvertingIterable.java:1)
        at com.convertro.naive.NaiveHitReducer.reduce(Unknown Source)
        at com.convertro.mapreduce.HitReducer.reduce(Unknown Source)
        at com.convertro.mapreduce.HitReducer.reduce(Unknown Source)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:2

This happens while reading the WritableHit class (the input value of the reduce phase). Here is the code of the WritableHit class:

public class WritableHit implements WritableComparable<WritableHit> {

    private Hit hit;

    public WritableHit() {
        this(null);
    }

    public WritableHit(Hit hit) {
        this.hit = hit;
    }

    @Override
    public void readFields(DataInput input) throws IOException {
        String clientName = input.readUTF();
        String clientSiteId = input.readUTF();
        String eventUniqueId = input.readUTF();
        String eventValue = input.readUTF();
        String pageRequested = input.readUTF();
        String refererUrl = input.readUTF();
        String uniqueHitId = input.readUTF();
        String userAgent = input.readUTF();
        String userIdentifier = input.readUTF();
        String userIp = input.readUTF();
        int timestamp = input.readInt();
        int version = input.readInt();

        hit = new Hit(version, uniqueHitId, clientName, clientSiteId, timestamp, userIdentifier, 
                userIp, pageRequested, refererUrl, userAgent, eventUniqueId, eventValue);
    }

    @Override
    public void write(DataOutput output) throws IOException {
        output.writeUTF(hit.getClientName());
        output.writeUTF(hit.getClientSiteId());
        output.writeUTF(hit.getEventUniqueId());
        output.writeUTF(hit.getEventValue());
        output.writeUTF(hit.getPageRequested());
        output.writeUTF(hit.getRefererUrl());
        output.writeUTF(hit.getUniqueHitId());
        output.writeUTF(hit.getUserAgent());
        output.writeUTF(hit.getUserIdentifier());
        output.writeUTF(hit.getUserIp());
        output.write(hit.getTimestamp());
        output.write(hit.getVersion());
    }

    public Hit getHit() {
        return hit;
    }

    @Override
    public int compareTo(WritableHit o) {
        return hit.getUniqueHitId().compareTo(o.getHit().getUniqueHitId());
    }
}

Any help is much appreciated.

Thanks

1 answer:

Answer 0 (score: 1)

I figured it out.

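Apparently, when you implement a Writable object, you should use the writeInt method rather than the write method. write(int) emits only the low-order byte of its argument, while readInt() in readFields expects four bytes, so the reader falls out of step with the serialized data.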

Once I did that, it worked like a charm.
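
To make the mismatch concrete, here is a minimal sketch that uses only java.io, with no Hadoop classes involved. The class name WriteVsWriteIntDemo, its helper methods, and the sample value are made up for illustration and are not part of the original job. It shows that write(int) produces one byte, writeInt(int) produces four, and readInt() only succeeds on the latter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demo class; it only mirrors the int handling of WritableHit.
final class WriteVsWriteIntDemo {

    // Serializes the value with write(int), which emits only the low-order byte.
    static byte[] withWrite(int value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.write(value);
        out.flush();
        return buffer.toByteArray();
    }

    // Serializes the value with writeInt(int), which emits all four bytes.
    static byte[] withWriteInt(int value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(value);
        out.flush();
        return buffer.toByteArray();
    }

    // Reads one int back the same way readFields() does, via readInt().
    static int readBack(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return in.readInt();
    }

    public static void main(String[] args) throws IOException {
        int timestamp = 1317484090; // arbitrary sample value

        System.out.println(withWrite(timestamp).length);    // prints 1
        System.out.println(withWriteInt(timestamp).length); // prints 4
        System.out.println(readBack(withWriteInt(timestamp)) == timestamp); // prints true

        try {
            readBack(withWrite(timestamp));
        } catch (EOFException e) {
            // Only one byte is available, so readInt() cannot assemble an int.
            System.out.println("readInt() ran out of bytes");
        }
    }
}
```

In this isolated sketch the failure shows up as an EOFException because the buffer holds a single record. In the reducer the buffer presumably holds further data after the record, so readInt() consumes bytes that belong to something else and the failure surfaces differently, as in the stack trace above. Applied to WritableHit, the fix is to replace the two output.write(...) calls at the end of write() with output.writeInt(...).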