Output is doubled and out of order

Time: 2013-10-02 21:07:04

Tags: python hadoop mapreduce

I am writing a simple Hadoop program in Python.

mapper.py

#!/usr/bin/python
import sys
import numpy
from collections import OrderedDict

for line in sys.stdin:
        test = OrderedDict([('1', [11, 5, 5, 5, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]), ('2', [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 4, 0, 0, 0, 0, 1, 0, 0, 0, 29, 28, 18, 12, 11, 11, 10, 9, 9, 9, 8, 8, 8, 6, 6, 6, 5, 5, 4, 4])])
        for f in test:
                print numpy.asarray(test[f])

reducer.py

#!/usr/bin/python
import sys
for line in sys.stdin:
    print line,

Input file

1
2

Expected output

[11  5  5  5  4  4  4  3  3  3  3  3  3  3  2  2  2  2  2  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0]
[ 0  0  0  0  0  0  0  0  1  0  3  4  0  0  0  0  1  0  0  0 29 28 18 12 11 11 10  9  9  9  8  8  8  6  6  6  5  5  4  4]
[11  5  5  5  4  4  4  3  3  3  3  3  3  3  2  2  2  2  2  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0]
[ 0  0  0  0  0  0  0  0  1  0  3  4  0  0  0  0  1  0  0  0 29 28 18 12 11 11 10  9  9  9  8  8  8  6  6  6  5  5  4  4]

Actual output

  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0]  
  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0]  
 11 10  9  9  9  8  8  8  6  6  6  5  5  4  4]  
 11 10  9  9  9  8  8  8  6  6  6  5  5  4  4]  
[ 0  0  0  0  0  0  0  0  1  0  3  4  0  0  0  0  1  0  0  0 29 28 18 12 11 
[ 0  0  0  0  0  0  0  0  1  0  3  4  0  0  0  0  1  0  0  0 29 28 18 12 11 
[11  5  5  5  4  4  4  3  3  3  3  3  3  3  2  2  2  2  2  2  0  0  0  0  0 
[11  5  5  5  4  4  4  3  3  3  3  3  3  3  2  2  2  2  2  2  0  0  0  0  0

1 answer:

Answer 0 (score: 0)

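The output is sorted as strings, and your strings contain brackets. Each array is too long for numpy to print on one line, so print wraps it onto two lines, and those lines are then sorted independently: the continuation lines start with a space, which sorts before the opening bracket. You can fix this by formatting each array as a single-line string, like this: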

print ', '.join(str(item) for item in numpy.asarray(test[f]))
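As a minimal sketch of the same idea, written so that it also runs under Python 3 (the scripts in the question are Python 2), the helper below joins a row's values into one line. The name format_row is hypothetical, and a plain list stands in for the numpy array only to keep the sketch free of dependencies; iterating over a one-dimensional numpy array should work the same way.

```python
def format_row(values):
    """Join the values into one comma-separated line, with no brackets and no wrapping."""
    return ', '.join(str(item) for item in values)


# Shortened sample row, for illustration only (not the full 40-element array).
row = [11, 5, 5, 5, 4, 4, 4, 3]
line = format_row(row)
print(line)
```

Because every record is now a single line, the sort step can still reorder whole records, but it can no longer split one record into separately sorted fragments.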

You can read this and this other SO question for more details.
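The ordering seen in the actual output can be reproduced without Hadoop. The sketch below assumes the lines are compared as plain strings, as the answer describes; the fragments are shortened stand-ins for the wrapped lines in the question, not the full output.

```python
# Shortened stand-ins for the lines the mapper emits for one input record:
# each array is printed as a first line starting with '[' and a
# continuation line starting with a space.
fragments = [
    "[11  5  5  5",   # start of the first array
    "  0  0  1  0]",  # wrapped tail of the first array
    "[ 0  0  0  0",   # start of the second array
    " 11 10  4  4]",  # wrapped tail of the second array
]

# Plain string comparison: ' ' (0x20) sorts before '[' (0x5B),
# so every wrapped tail ends up ahead of every array start.
ordered = sorted(fragments)
for fragment in ordered:
    print(fragment)
```

The tails come out first and the array starts last, which matches the pattern in the actual output above.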