mrjob reducer produces no output on EMR, but produces output when run locally

Asked: 2014-10-08 12:54:13

Tags: hadoop emr mrjob

When I run my MapReduce job locally, the reducer gives me the output I expect, but the same code on EMR produces no output at all. My cluster has 1 master node and 10 core nodes.

This is the job output. No errors are shown, yet the reducers read 378 records and wrote none:

Map-Reduce Framework
    Map input records=3000
    Map output records=378
    Map output bytes=36054
    Map output materialized bytes=40448
    Input split bytes=1420
    Combine input records=0
    Combine output records=0
    Reduce input groups=179
    Reduce shuffle bytes=40448
    Reduce input records=378
    Reduce output records=0
    Spilled Records=756
    Shuffled Maps =380
    Failed Shuffles=0
    Merged Map outputs=380
    GC time elapsed (ms)=23484
    CPU time spent (ms)=125780
    Physical memory (bytes) snapshot=9989242880
    Virtual memory (bytes) snapshot=52768247808
    Total committed heap usage (bytes)=6517702656
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters 
    Bytes Read=711180681
File Output Format Counters 
    Bytes Written=0

The reducer code follows:

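def reducer(self, key, val):
    # Fields of the hottest record seen so far for this key.
    best = -60
    best_name = None
    lat = 0
    longi = 0
    yr = 0
    genre = None

    # The loop variables have their own names so that later records
    # do not overwrite the fields saved from the best record.
    for hot, name, cur_lat, cur_longi, cur_yr, cur_genre in val:
        if hot > best:
            best = hot
            best_name = name
            lat = cur_lat
            longi = cur_longi
            yr = cur_yr
            genre = cur_genre

    yield (key, (best, best_name, lat, longi, yr, genre))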
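
To check the selection logic without Hadoop, the same step can be run as a plain Python function. This is only a minimal sketch. It assumes each value is a 6-item record `(hot, name, lat, longi, yr, genre)`, as in the reducer. The function name `pick_best`, the `floor` parameter and the sample records are made up for illustration. Records may arrive as lists rather than tuples (mrjob's default JSON protocol turns tuples into lists), so the sketch accepts either.

```python
def pick_best(key, values, floor=-60):
    """Return (key, record) for the record with the highest `hot` value.

    `floor` mirrors the starting value of `best` in the reducer; if no
    record beats it, a placeholder record is returned instead.
    """
    best = (floor, None, 0, 0, 0, None)
    for record in values:
        # record is assumed to be (hot, name, lat, longi, yr, genre)
        if record[0] > best[0]:
            best = tuple(record)
    return key, best


# Hypothetical sample data, not taken from the real input.
sample = [
    (0.2, "a", 1.0, 2.0, 1999, "rock"),
    (0.9, "b", 3.0, 4.0, 2005, "jazz"),
    (0.5, "c", 5.0, 6.0, 2010, "pop"),
]
result = pick_best("artist-1", sample)
print(result)  # ('artist-1', (0.9, 'b', 3.0, 4.0, 2005, 'jazz'))
```

Running this on a few records copied from the EMR input may help tell a logic problem from an environment problem. It does not by itself explain the empty output on EMR.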

0 answers:

No answers yet