I am working on Hadoop benchmarking and using the teragen and terasort tools.
The teragen tool appears to work fine. I am using the following command:
hadoop jar /Users/karan.verma/Documents/backups/h/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar teragen -Dmapreduce.job.maps=100 1t random-data1
It prints the following output on the console:
17/10/03 17:19:21 INFO mapreduce.Job: Job job_1507026170114_0005 completed successfully
17/10/03 17:19:21 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=10661490
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=8594
HDFS: Number of bytes written=0
HDFS: Number of read operations=400
HDFS: Number of large read operations=0
HDFS: Number of write operations=200
Job Counters
Launched map tasks=100
Other local map tasks=100
Total time spent by all maps in occupied slots (ms)=1089472
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=1089472
Total vcore-milliseconds taken by all map tasks=1089472
Total megabyte-milliseconds taken by all map tasks=1115619328
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=8594
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=9690
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=11115954176
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
After this, when I run the terasort tool with the following command:
hadoop jar /Users/karan.verma/Documents/backups/h/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort random-data1 sorted-data
I get the following error:
17/10/03 17:20:10 INFO terasort.TeraSort: starting
17/10/03 17:20:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/10/03 17:20:11 INFO input.FileInputFormat: Total input paths to process : 100
Spent 168ms computing base-splits.
Spent 2ms computing TeraScheduler splits.
Computing input splits took 172ms
Sampling 10 splits of 100
Making 1 from 0 sampled records
17/10/03 17:20:11 ERROR terasort.TeraSort: Requested more partitions than input keys (1 > 0)
Can anyone help explain why this is happening? Is something missing from the configuration?
Answer 0 (score: 0)
Check the output of the teragen command, because the generated output files may be empty. This error occurs when the input data size is 0.
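In the teragen output posted in the question, the counters already read Map output records=0 and Bytes Written=0, which is consistent with an empty random-data1 directory. A minimal sketch of how that check could be scripted, assuming the teragen console output was saved to a file (the file name teragen.log and the helper teragen_wrote_data are hypothetical, not part of Hadoop):

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop tool): reads saved teragen console
# output and reports whether the job wrote any records, based on the
# "Map output records" job counter.
teragen_wrote_data() {
    # Pull the number that follows "Map output records=" in the log file.
    records=$(sed -n 's/.*Map output records=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1)
    if [ -z "$records" ]; then
        echo "counter not found"
        return 2
    fi
    if [ "$records" -eq 0 ]; then
        echo "empty: teragen wrote 0 records"
        return 1
    fi
    echo "ok: teragen wrote $records records"
}

# Example use, assuming the output was captured beforehand:
#   teragen_wrote_data teragen.log
#
# The generated directory can also be inspected directly on the cluster:
#   hadoop fs -ls random-data1
#   hadoop fs -du -s -h random-data1
```

If the helper reports 0 records, or the directory listing shows only zero-length part files, terasort has no keys to sample and will fail with the "Requested more partitions than input keys" error shown above.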