Question

I have the following sample data, which I am using to learn Hadoop MapReduce. It is follower/followee data.

Follower,followee   
    a,b
    a,c
    a,d
    c,b
    b,d
    d,a
    b,c
    b,e
    e,f

That is, a follows b, a follows c, and so on.

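I am trying to process the data so that if a follows b and b also follows a, that pair appears in the output text file. I am new to MapReduce and am looking for a way to get the following result.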

 a,d
 c,b

Answer 1

You can achieve this with a trick.

The trick is to pass keys to the reducer in such a way that (a,d) and (d,a) have the same key and end up in the same reducer:

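When (a,d) arrives:

'a' < 'd', hence emit:
key => a,d
value => a,d

When (d,a) arrives:

'd' > 'a', hence emit:
key => a,d
value => d,a

The key is always formed so that the alphabetically lower letter comes before the higher one. So for both records, the key is "a,d".

The mapper output will therefore be:

Record: a,b
Key = a,b  Value = a,b

Record: a,c
Key = a,c  Value = a,c

Record: a,d
Key = a,d  Value = a,d

Record: c,b
Key = b,c  Value = c,b

Record: b,d
Key = b,d  Value = b,d

Record: d,a
Key = a,d  Value = d,a

Record: b,c
Key = b,c  Value = b,c

Record: b,e
Key = b,e  Value = b,e

Record: e,f
Key = e,f  Value = e,f

Now, in the reducers, the records arrive grouped by key, in the following order:

Record 1: 
    Key = a,b  Value = a,b

Record 2: 
    Key = a,c  Value = a,c

Record 3: 
    Key = a,d  Value = a,d
    Key = a,d  Value = d,a

Record 4: 
    Key = b,c  Value = c,b
    Key = b,c  Value = b,c

Record 5: 
    Key = b,d  Value = b,d

Record 6: 
    Key = b,e  Value = b,e

Record 7: 
    Key = e,f  Value = e,f

So in the reducer you only need to handle Records 3 and 4, the only keys that received two values (one for each direction):

Record 3: 
    Key = a,d  Value = a,d
    Key = a,d  Value = d,a

Record 4: 
    Key = b,c  Value = c,b
    Key = b,c  Value = b,c

The output will therefore be:

a,d
c,b

This logic also works if you have names instead of single letters. For example, you can use the following logic in the mapper (where s1 is the first string and s2 is the second string). Note that this snippet puts the larger string first, the reverse of the letter walkthrough above; either convention works as long as it is applied consistently to every record.

// Build a key that is the same for (s1,s2) and (s2,s1).
// Here the lexicographically larger string (ignoring case) goes first.
String key = "";
int compare = s1.compareToIgnoreCase(s2);
if(compare >= 0)
    key = s1 + "," + s2;
else
    key = s2 + "," + s1;

So, if you have:

String s1 = "Stack";
String s2 = "Overflow";

the key will be:

Stack,Overflow

Similarly, if you have:

s1 = "Overflow";
s2 = "Stack";

the key is still:

Stack,Overflow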
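
The whole flow can be tried out in plain Java without a cluster. The following is only a minimal simulation of the map, group and reduce steps, not a Hadoop job: the class and method names (MutualFollowers, pairKey, mutualPairs) are hypothetical, the smaller name goes first in the key (as in the letter walkthrough), and records are assumed to be of the form "x,y" with no extra spaces. It emits whichever direction appeared first in the input; in a real Hadoop job the order of values within a key is not guaranteed, so that part would need care.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Plain-Java simulation of the map -> group -> reduce flow.
// All names here are hypothetical; none of them are Hadoop APIs.
class MutualFollowers {

    // "Map" step: build a key that is identical for (a,d) and (d,a)
    // by putting the lexicographically smaller name first.
    static String pairKey(String s1, String s2) {
        return s1.compareTo(s2) <= 0 ? s1 + "," + s2 : s2 + "," + s1;
    }

    // "Group" + "reduce" steps: collect records per key, then keep only
    // the keys that received both directions of the relationship.
    static List<String> mutualPairs(List<String> records) {
        // TreeMap keeps keys sorted, mimicking the sorted key order a reducer sees.
        Map<String, Set<String>> grouped = new TreeMap<>();
        for (String record : records) {
            String[] parts = record.split(",");
            String key = pairKey(parts[0], parts[1]);
            // LinkedHashSet remembers insertion order and drops duplicate records.
            grouped.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(record);
        }

        List<String> result = new ArrayList<>();
        for (Set<String> values : grouped.values()) {
            // Two distinct values under one key can only be x,y and y,x.
            if (values.size() == 2) {
                result.add(values.iterator().next()); // first-seen direction
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "a,b", "a,c", "a,d", "c,b", "b,d", "d,a", "b,c", "b,e", "e,f");
        mutualPairs(input).forEach(System.out::println);
    }
}
```

In an actual job, the body of pairKey would sit in the mapper's map method and the "two distinct values" check in the reducer's reduce method.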
