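Suppose I have a mapper class named MAxTemperatureMapper and I give it a file with 100 records as input, read through the TextInputFormat class. Is the number of mapper instances created equal to the number of input splits, or is a new mapper created for every key-value pair in the input?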
Answer 0 (score: 1)
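The number of mappers is equal to the number of input splits. One mapper is created per split, and its map() method is then called once for each key-value pair in that split, so no new mapper is created per record.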
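Whenever you submit a job, the first thing the framework does is determine the number of input splits. Splits are logical: they describe ranges of the input rather than physically dividing the data.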
Usually the split size is equal to the HDFS block size, but it can be configured to be smaller or larger than the block size. Keeping the split size equal to the block size is generally the most efficient choice, which is why it is the default.
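To make the "can be configured" part concrete, here is a minimal sketch of the rule Hadoop's FileInputFormat is generally understood to apply when it picks a split size: max(minSize, min(maxSize, blockSize)). The class name SplitSizeSketch and its method are hypothetical and written in plain Java so they run without Hadoop on the classpath. In a real job the two bounds come from the mapreduce.input.fileinputformat.split.minsize and mapreduce.input.fileinputformat.split.maxsize properties.

```java
// Illustrative only: SplitSizeSketch is a hypothetical class, not a Hadoop API.
final class SplitSizeSketch {

    // Clamp the block size between the configured minimum and maximum split size.
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb;

        // Bounds left wide open: the split size equals the block size.
        System.out.println(computeSplitSize(1L, Long.MAX_VALUE, blockSize));

        // Raising the minimum above the block size gives larger splits (fewer mappers).
        System.out.println(computeSplitSize(256 * mb, Long.MAX_VALUE, blockSize));

        // Lowering the maximum below the block size gives smaller splits (more mappers).
        System.out.println(computeSplitSize(1L, 64 * mb, blockSize));
    }
}
```

With the bounds left wide open the result is simply the block size, which matches the usual case described above.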
Suppose you have a 1 GB file and your default block size is 128 MB. The file then occupies approximately 8 blocks, so 8 input splits are created and hence 8 mappers are invoked for the job.
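The arithmetic in that example can be checked with a short sketch. It assumes a single splittable file and imitates the loop FileInputFormat is generally understood to use, including its 10% slack (SPLIT_SLOP = 1.1), which lets a small tail be merged into the last split instead of getting its own mapper. SplitCountSketch and countSplits are hypothetical names, not Hadoop APIs, and the sketch ignores empty and non-splittable (for example gzip-compressed) files.

```java
// Illustrative only: SplitCountSketch is a hypothetical class, not a Hadoop API.
final class SplitCountSketch {

    // Assumed slack factor: the last split may be up to 10% larger than splitSize.
    private static final double SPLIT_SLOP = 1.1;

    // Counts the splits (and therefore map tasks) for one splittable, non-empty file.
    static int countSplits(long fileLength, long splitSize) {
        if (fileLength <= 0 || splitSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        int splits = 0;
        long bytesRemaining = fileLength;
        // Carve off full-size splits while more than 1.1 splits' worth of data remains.
        while ((double) bytesRemaining / splitSize > SPLIT_SLOP) {
            splits++;
            bytesRemaining -= splitSize;
        }
        // Whatever is left (at most 1.1 * splitSize) becomes the final split.
        if (bytesRemaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println(countSplits(1024 * mb, 128 * mb)); // 1 GB file: prints 8
        System.out.println(countSplits(10 * 1024L, 128 * mb)); // 10 KB file: prints 1
    }
}
```

By the same logic, a 100-record text file that is far smaller than one block would end up as a single split, so a single mapper would process all 100 records.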