MapReduce: how to check whether a key exists in both tables for a left join

Time: 2016-10-17 13:21:52

Tags: java hadoop join mapreduce

I have two map methods that send their output to my reducer.

Key/value

number1 value1
number2 value2
number3 value3
number4 value10

and key/value

number1 value4
number2 value5
number5 value6
number6 value7

So the reducer receives them like this:

number1 value1 value4
number2 value2 value5
number3 value3
number4 value10
number5 value6
number6 value7
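
The grouping above can be reproduced with a minimal plain-Java sketch that does not use Hadoop. The class name `ShuffleSketch` and the method `group` are hypothetical and only mimic what the shuffle phase does before `reduce` is called. In a real job the order of the values within one key is not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {
    // Groups (key, value) pairs from several map outputs by key,
    // roughly the way the shuffle phase presents them to reduce().
    static Map<String, List<String>> group(String[][]... mapOutputs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[][] output : mapOutputs) {
            for (String[] pair : output) {
                grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        String[][] table1 = {
            {"number1", "value1"}, {"number2", "value2"},
            {"number3", "value3"}, {"number4", "value10"}
        };
        String[][] table2 = {
            {"number1", "value4"}, {"number2", "value5"},
            {"number5", "value6"}, {"number6", "value7"}
        };
        // Print one line per key, followed by all of its values
        group(table1, table2).forEach(
            (key, values) -> System.out.println(key + " " + String.join(" ", values)));
    }
}
```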

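How can I write to the context only the pairs whose key has a value in table 1?

Or is there a way to avoid passing these values to the reducer at all?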

number5 value6
number6 value7

and pass only these:

number1 value1 value4
number2 value2 value5
number3 value3
number4 value10
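
One possible way to keep the unmatched table 2 records away from the reducer is to filter them in the table 2 mapper against the set of table 1 keys. The sketch below shows only that filtering step in plain Java, without Hadoop classes. The names `MapSideFilterSketch` and `filterByTable1Keys` are hypothetical, and it assumes the table 1 keys fit in memory. In a real job the set would typically be loaded once in the mapper's `setup()` method, for example from a file shipped through the distributed cache.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MapSideFilterSketch {
    // Keeps only the table 2 records whose key also occurs in table 1.
    // Each record is a two-element array: {key, value}.
    static List<String[]> filterByTable1Keys(Set<String> table1Keys, List<String[]> table2) {
        List<String[]> kept = new ArrayList<>();
        for (String[] record : table2) {
            if (table1Keys.contains(record[0])) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> table1Keys =
            new HashSet<>(Arrays.asList("number1", "number2", "number3", "number4"));
        List<String[]> table2 = Arrays.asList(
            new String[] {"number1", "value4"},
            new String[] {"number2", "value5"},
            new String[] {"number5", "value6"},
            new String[] {"number6", "value7"});
        // Only the records for number1 and number2 survive the filter
        for (String[] record : filterByTable1Keys(table1Keys, table2)) {
            System.out.println(record[0] + " " + record[1]);
        }
    }
}
```

With this filter in place, number5 and number6 are never emitted by the mapper, so the reducer does not need to check for them.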

I thought of prepending a marker word to every value from table 1 and checking for it in the reducer:

for (Text value : values) {
    String[] stringValue = value.toString().split(" ");
    if (stringValue[0].equals("checkValue")) {
        boo = 1; // this key has at least one value from table 1
    }
}
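
The marker idea can be sketched end to end in plain Java, without Hadoop classes. This is only an illustration under assumptions: the tags `T1` and `T2`, the class `TaggedLeftJoinSketch`, and its `reduce` method are hypothetical stand-ins for what the two mappers and the reducer would do. The method returns `null` for a key with no table 1 value, which corresponds to not calling `context.write` for that key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TaggedLeftJoinSketch {
    // Hypothetical markers that the two mappers would prepend to their values
    static final String TABLE1_TAG = "T1";
    static final String TABLE2_TAG = "T2";

    // Returns the joined line for one key, or null when the key has no table 1 value.
    static String reduce(String key, List<String> taggedValues) {
        List<String> left = new ArrayList<>();   // values from table 1
        List<String> right = new ArrayList<>();  // values from table 2
        for (String tagged : taggedValues) {
            String[] parts = tagged.split(" ", 2); // tag, then the original value
            if (parts.length < 2) {
                continue; // skip malformed values
            }
            if (parts[0].equals(TABLE1_TAG)) {
                left.add(parts[1]);
            } else if (parts[0].equals(TABLE2_TAG)) {
                right.add(parts[1]);
            }
        }
        if (left.isEmpty()) {
            return null; // key exists only in table 2: emit nothing
        }
        List<String> out = new ArrayList<>();
        out.add(key);
        out.addAll(left);
        out.addAll(right);
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(reduce("number1", Arrays.asList("T2 value4", "T1 value1")));
        System.out.println(reduce("number3", Arrays.asList("T1 value3")));
        System.out.println(reduce("number5", Arrays.asList("T2 value6")));
    }
}
```

Collecting both sides before deciding means the result does not depend on the order in which the values arrive.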

But this doesn't work, and the code has too many unnecessary if/else branches:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Reduce extends Reducer<Text, Text, Text, Text> {
    UtilClass utilClass = new UtilClass();

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Per-key state is local: the same Reduce instance handles many keys,
        // so a flag or result kept in a field would leak into the next key.
        boolean hasCdr = false;
        String string = "";
        // send only date type values
        Map<Text, Integer> m = new HashMap<>();
        // Populate the HashMap
        for (Text value : values) {
            String[] stringValue = value.toString().split(" ");
            if (stringValue[0].equals("cdr")) {
                hasCdr = true;
                Text element = new Text(utilClass.getPeriod(stringValue[1] + " " + stringValue[2]));
                m.merge(element, 1, Integer::sum); // count occurrences per period
            } else if (string.isEmpty() || stringValue[4].equals("A") || stringValue[4].equals("S")) {
                string = stringValue[1] + " " + stringValue[2] + " " + stringValue[3];
            }
        }

        // Left join, assuming "cdr" marks the table 1 side:
        // a key without any "cdr" record is not written at all.
        if (!hasCdr) {
            return;
        }

        // Display the frequencies
        StringBuilder result = new StringBuilder();
        for (Map.Entry<Text, Integer> entry : m.entrySet()) {
            result.append(entry.getValue()).append("_").append(entry.getKey()).append(" ");
        }

        // add BAN_KEY; MARKET_KEY_SRC;ACCOUNT_TYPE_KEY
        // add rest for every entry
        context.write(key, new Text(string + " " + result));
    }
}
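
One thing to watch in a reducer like this: Hadoop calls `reduce` once per key on the same `Reducer` instance, so a per-key flag kept in a field stays set for every later key. A minimal plain-Java sketch of that effect follows. The classes `FieldFlagReducer` and `LocalFlagReducer` are hypothetical and only mimic the flag handling, not the Hadoop API.

```java
public class StaleStateSketch {
    // The flag is a field, so it is never reset between calls.
    static class FieldFlagReducer {
        int boo = 0;

        boolean reduce(String... values) {
            for (String value : values) {
                if (value.startsWith("cdr ")) {
                    boo = 1;
                }
            }
            return boo == 1;
        }
    }

    // The flag is a local variable, so every call starts from false.
    static class LocalFlagReducer {
        boolean reduce(String... values) {
            boolean seen = false;
            for (String value : values) {
                if (value.startsWith("cdr ")) {
                    seen = true;
                }
            }
            return seen;
        }
    }

    public static void main(String[] args) {
        FieldFlagReducer field = new FieldFlagReducer();
        LocalFlagReducer local = new LocalFlagReducer();
        // First key has a "cdr" value, second key does not
        System.out.println(field.reduce("cdr a b") + " " + field.reduce("other x"));
        System.out.println(local.reduce("cdr a b") + " " + local.reduce("other x"));
    }
}
```

The field-based version reports a match for the second key as well, although that key has no "cdr" value.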

I am trying to join these tables. I can't use Hive or other tools.

0 answers:

No answers