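I'm fairly new to Hadoop, but I've been reading "Hadoop: The Definitive Guide", so I think I understand the basic concepts.
I'm using Hadoop 0.20.2 to run a fairly simple job, but I get the following exception: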
java.lang.ArrayIndexOutOfBoundsException: 4096
at java.io.ByteArrayInputStream.read(ByteArrayInputStream.java:127)
at java.io.DataInputStream.readInt(DataInputStream.java:373)
at com.convertro.mapreduce.WritableHit.readFields(Unknown Source)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
at org.apache.hadoop.mapreduce.ReduceContext$ValueIterator.next(ReduceContext.java:163)
at com.convertro.mapreduce.HitConvertingIterable$HitConvertingIterator.next(HitConvertingIterable.java:35)
at com.convertro.mapreduce.HitConvertingIterable$HitConvertingIterator.next(HitConvertingIterable.java:1)
at com.convertro.naive.NaiveHitReducer.reduce(Unknown Source)
at com.convertro.mapreduce.HitReducer.reduce(Unknown Source)
at com.convertro.mapreduce.HitReducer.reduce(Unknown Source)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:2
This happens while the WritableHit class (the input value of the reduce phase) is being read. Here is the code of the WritableHit class:
public class WritableHit implements WritableComparable<WritableHit> {
    private Hit hit;

    public WritableHit() {
        this(null);
    }

    public WritableHit(Hit hit) {
        this.hit = hit;
    }

    @Override
    public void readFields(DataInput input) throws IOException {
        String clientName = input.readUTF();
        String clientSiteId = input.readUTF();
        String eventUniqueId = input.readUTF();
        String eventValue = input.readUTF();
        String pageRequested = input.readUTF();
        String refererUrl = input.readUTF();
        String uniqueHitId = input.readUTF();
        String userAgent = input.readUTF();
        String userIdentifier = input.readUTF();
        String userIp = input.readUTF();
        int timestamp = input.readInt();
        int version = input.readInt();
        hit = new Hit(version, uniqueHitId, clientName, clientSiteId, timestamp, userIdentifier,
                userIp, pageRequested, refererUrl, userAgent, eventUniqueId, eventValue);
    }

    @Override
    public void write(DataOutput output) throws IOException {
        output.writeUTF(hit.getClientName());
        output.writeUTF(hit.getClientSiteId());
        output.writeUTF(hit.getEventUniqueId());
        output.writeUTF(hit.getEventValue());
        output.writeUTF(hit.getPageRequested());
        output.writeUTF(hit.getRefererUrl());
        output.writeUTF(hit.getUniqueHitId());
        output.writeUTF(hit.getUserAgent());
        output.writeUTF(hit.getUserIdentifier());
        output.writeUTF(hit.getUserIp());
        output.write(hit.getTimestamp());
        output.write(hit.getVersion());
    }

    public Hit getHit() {
        return hit;
    }

    @Override
    public int compareTo(WritableHit o) {
        return hit.getUniqueHitId().compareTo(o.getHit().getUniqueHitId());
    }
}
Any help is greatly appreciated.
Thanks
Answer 0 (score: 1)
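I figured it out.
Apparently, when you implement a Writable object, you should use the writeInt method rather than the write method. DataOutput.write(int) emits only the low-order byte of its argument, whereas readInt() expects four bytes, so readFields runs past the end of what write produced.
Once I made that change, it worked like a charm.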
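To illustrate the mismatch outside Hadoop, here is a minimal sketch that uses only java.io streams. The class name WriteVsWriteInt, the helper methods, and the sample values are hypothetical and not part of the original job; the sketch only assumes that write and readFields behave like the plain DataOutput and DataInput calls shown in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WriteVsWriteInt {

    // Serializes one string followed by one int, the same pattern as WritableHit.write.
    // useWriteInt selects between the buggy call (write) and the fixed call (writeInt).
    static byte[] serialize(String text, int number, boolean useWriteInt) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(text);
            if (useWriteInt) {
                out.writeInt(number); // four bytes, which is what readInt() consumes
            } else {
                out.write(number);    // only the low-order byte of the int
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the record back the way readFields does: readUTF, then readInt.
    static int deserializeInt(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readUTF();
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int timestamp = 1300000000; // hypothetical sample value
        byte[] broken = serialize("hit-1", timestamp, false);
        byte[] fixed = serialize("hit-1", timestamp, true);

        System.out.println("write(int) record length:    " + broken.length);
        System.out.println("writeInt(int) record length: " + fixed.length);
        System.out.println("round trip with writeInt:    " + deserializeInt(fixed));

        try {
            deserializeInt(broken);
        } catch (UncheckedIOException e) {
            // Three of the four bytes readInt() needs were never written.
            System.out.println("round trip with write failed: "
                    + (e.getCause() instanceof EOFException));
        }
    }
}
```

In this standalone sketch the short record surfaces as an EOFException, because the stream simply ends. In the Hadoop job from the question the same kind of over-read showed up as the ArrayIndexOutOfBoundsException in the stack trace.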