MapReduce Programming: MultiMap overwrites existing values

Time: 2015-11-21 18:55:49

Tags: join mapreduce multimap

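I am implementing a join algorithm in MapReduce. In the map phase, I emit the joinColumn as the key and the tuple as the value. In the reduce method, I receive the key and the values as (columnname, row). In the reduce phase, I need to separate the rows into two groups according to the table they belong to.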

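I am using a Multimap to do this, but the Multimap is overwriting the existing values. To work around this, I overrode "equals" and "hashCode", but that did not solve the problem.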

public void reduce(Text key,Iterable<Text> values,Context context) throws IOException, InterruptedException{

    Multimap<String,Table> entry=LinkedListMultimap.create();
    for(Text val : values){
        String[] row=val.toString().split(",");
        Table t = new Table();
        t.setTablename(row[0]);
        t.setColumns(val);
        entry.put(row[0],t);
    }
    for (String k: entry.keySet()){
        System.out.println("Key  : "+k);
        Collection<Table> rows=entry.get(k);
        Iterator<Table> i=rows.iterator();
        while(i.hasNext()){
            Table t=i.next();
            System.out.println(t.getColumns());
        }
    }
}

public class Table {
    private String tablename;
    private Text columns;
    public String getTablename() {
        return tablename;
    }
    public void setTablename(String tablename) {
        this.tablename = tablename;
    }
    public Text getColumns() {
        return columns;
    }
    public void setColumns(Text columns) {
        this.columns = columns;
    }
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((columns == null) ? 0 : columns.hashCode());
        result = prime * result
                + ((tablename == null) ? 0 : tablename.hashCode());
        return result;
    }
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Table other = (Table) obj;
        if (columns == null) {
            if (other.columns != null)
                return false;
        } else if (!columns.equals(other.columns))
            return false;
        if (tablename == null) {
            if (other.tablename != null)
                return false;
        } else if (!tablename.equals(other.tablename))
            return false;
        return true;
    }
}

I get the following output:

Key  : S
R, 2, Don, Larson, Newark, 555-3221
R, 2, Don, Larson, Newark, 555-3221
Key  : R
R, 2, Don, Larson, Newark, 555-3221
Key  : S
R, 3, Sal, Maglite, Nutley, 555-6905
R, 3, Sal, Maglite, Nutley, 555-6905
Key  : R
R, 3, Sal, Maglite, Nutley, 555-6905
Key  : R
S, 4, 22000, 7000, part1
Key  : S
S, 4, 22000, 7000, part1

It is overwriting the existing values. Can anyone help me solve this?

1 Answer:

Answer 0: (score: 1)

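Your problem is that the object returned while iterating over values is reused by the iterator. You need to copy it rather than just storing the reference in setColumns(). Something like: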

public void setColumns(Text columns) {
    this.columns = new Text(columns.toString());
}
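
To see why the copy matters, here is a minimal sketch in plain Java that needs no Hadoop on the classpath. It is only an illustration: ReuseDemo, reusingIterable, collectReferences and collectCopies are hypothetical names, and a StringBuilder stands in for Text as the mutable value that the iterator hands out again and again. The sample rows are made up for the demo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReuseDemo {

    // Hypothetical stand-in for the reducer's value iterator: every call to
    // next() returns the SAME mutable object, with only its contents changed.
    static Iterable<StringBuilder> reusingIterable(List<String> rows) {
        return () -> new Iterator<StringBuilder>() {
            private final StringBuilder shared = new StringBuilder();
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < rows.size();
            }

            @Override
            public StringBuilder next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.setLength(0);              // wipe the previous row
                shared.append(rows.get(index++)); // overwrite in place
                return shared;
            }
        };
    }

    // Buggy pattern: keeps the reference, so every entry aliases one object.
    static List<String> collectReferences(List<String> rows) {
        List<StringBuilder> kept = new ArrayList<>();
        for (StringBuilder val : reusingIterable(rows)) {
            kept.add(val);
        }
        return render(kept);
    }

    // Fixed pattern: keeps a fresh copy of each value.
    static List<String> collectCopies(List<String> rows) {
        List<StringBuilder> kept = new ArrayList<>();
        for (StringBuilder val : reusingIterable(rows)) {
            kept.add(new StringBuilder(val));
        }
        return render(kept);
    }

    private static List<String> render(List<StringBuilder> kept) {
        List<String> out = new ArrayList<>();
        for (StringBuilder sb : kept) {
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows, tagged with a table name in the first field.
        List<String> rows = Arrays.asList("S,1,x", "R,2,y", "R,3,z");
        System.out.println("references: " + collectReferences(rows));
        System.out.println("copies:     " + collectCopies(rows));
    }
}
```

With stored references, every entry ends up showing the last row that was read, which matches the repeated rows in the question's output. With copies, each entry keeps its own row. In the real reducer the same effect can be had by copying where the value is read, for example t.setColumns(new Text(val)), instead of changing the setter.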