Question

This is my first time using a custom data type in Hadoop. Here is my code:

Custom data type:

public class TwitterData implements Writable {

private Long id;
private String text;
private Long createdAt;

public TwitterData(Long id, String text, Long createdAt) {
    super();
    this.id = id;
    this.text = text;
    this.createdAt = createdAt;
}

public TwitterData() {
    this(new Long(0L), new String(), new Long(0L));
}

@Override
public void readFields(DataInput in) throws IOException {
    System.out.println("In readFields...");
    id = in.readLong();
    text = in.readLine();
    createdAt = in.readLong();
}

@Override
public void write(DataOutput out) throws IOException {
    System.out.println("In write...");
    out.writeLong(id);
    out.writeChars(text);
    out.writeLong(createdAt);
}

public Long getId() {
    return id;
}

public void setId(Long id) {
    this.id = id;
}

public String getText() {
    return text;
}

public void setText(String text) {
    this.text = text;
}

public Long getCreatedAt() {
    return createdAt;
}

public void setCreatedAt(Long createdAt) {
    this.createdAt = createdAt;
}
}

Mapper:

public class Map extends Mapper<Object, BSONObject, Text, TwitterData>{

@Override
public void map(Object key, BSONObject value, Context context) throws IOException, InterruptedException {
    BSONObject user = (BSONObject) value.get("user");
    String location = (String) user.get("location");

    TwitterData twitterData = new TwitterData((Long) value.get("id"),
            (String) value.get("text"), (Long) value.get("createdAt"));

    if(location.toLowerCase().indexOf("india") != -1) {
        context.write(new Text("India"), twitterData);
    } else {
        context.write(new Text("Other"), twitterData);
    }
}
}

Main job code:

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(TwitterData.class);

An exception is thrown after the map phase. I can't figure out why I get this error. Can anyone help? Thanks in advance.

Answer 1

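You write the text with writeChars but read it back with readLine. These are two different serialization formats: writeChars emits each character as two bytes with no length prefix and no terminator, while readLine consumes bytes until it sees a line terminator or the end of the stream. As a result readLine typically swallows the createdAt bytes as well, and the following readLong runs off the end of the data.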

What you need to do is use a matching pair of calls:

@Override
public void readFields(DataInput in) throws IOException {
    id = in.readLong();
    text = in.readUTF();
    createdAt = in.readLong();
}

@Override
public void write(DataOutput out) throws IOException {
    out.writeLong(id);
    out.writeUTF(text);
    out.writeLong(createdAt);
}
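
To see the difference outside of Hadoop, here is a minimal sketch that uses only java.io streams. The class name SerializationDemo and its two helper methods are made up for illustration; they mirror the field order of TwitterData but are not part of the original code. It assumes the sample values contain no bytes equal to a line terminator, which is the usual case and the one where the failure shows up as an EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the question's code.
class SerializationDemo {

    // Symmetric pair: writeUTF stores a 2-byte length before the encoded
    // string, so readUTF knows exactly where the text ends.
    static String roundTripUtf(long id, String text, long createdAt) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeLong(id);
            out.writeUTF(text);
            out.writeLong(createdAt);

            DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()));
            long readId = in.readLong();
            String readText = in.readUTF();
            long readCreatedAt = in.readLong();
            return readId + "|" + readText + "|" + readCreatedAt;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Mismatched pair from the question: writeChars has no length or
    // terminator, so readLine keeps reading into the createdAt bytes.
    @SuppressWarnings("deprecation") // readLine is deprecated; used only to reproduce the bug
    static boolean mismatchedPairHitsEof(long id, String text, long createdAt) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeLong(id);
            out.writeChars(text);
            out.writeLong(createdAt);

            DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()));
            in.readLong();
            in.readLine();  // consumes everything up to end of stream
            in.readLong();  // nothing left to read
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripUtf(1L, "hello", 2L));
        System.out.println(mismatchedPairHitsEof(1L, "hello", 2L));
    }
}
```

Two caveats if you go with writeUTF: it throws a NullPointerException when text is null, and it rejects strings whose encoded form is longer than 65535 bytes. If either can happen with your data, Hadoop's Text.writeString and Text.readString are a possible alternative pair.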

EOF exception when using a Hadoop custom data type
