I have a Perl script that generates data.
#! /usr/bin/perl -w
# Emit 3 rows; each row is the index followed by nine random 10-character strings.
for ($i = 0; $i < 3; $i++)
{
    $str1 = "";
    $str2 = "";
    $str3 = "";
    $str4 = "";
    $str5 = "";
    $str6 = "";
    $str7 = "";
    $str8 = "";
    $str9 = "";
    @chars = ('a'..'z', 'A'..'Z', '_');
    for (1..10) {
        # Append one random character to each string.
        $str1 .= $chars[rand @chars];
        $str2 .= $chars[rand @chars];
        $str3 .= $chars[rand @chars];
        $str4 .= $chars[rand @chars];
        $str5 .= $chars[rand @chars];
        $str6 .= $chars[rand @chars];
        $str7 .= $chars[rand @chars];
        $str8 .= $chars[rand @chars];
        $str9 .= $chars[rand @chars];
    }
    print "$i:$str1:$str2:$str3:$str4:$str5:$str6:$str7:$str8:$str9\n";
}
When I run the script with Hadoop Streaming, as shown below:
#!/bin/bash
hadoop fs -rm -R /user/oracle/output
echo "Start time :" `date` >> run_time_perl_hadoop.log
hadoop jar /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming.jar \
-D mapreduce.map.tasks=1 \
-input /user/oracle/perl_test/data_generator_hadoop_tarun.pl \
-output /user/oracle/output \
-mapper data_generator_hadoop_tarun.pl \
-file data_generator_hadoop_tarun.pl
echo "End time :" `date` >> run_time_perl_hadoop.log
it generates 6 rows instead of 3.
Any idea why?
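
One hypothesis worth checking (an assumption, not confirmed by anything in this post): the Perl script never reads its input and prints 3 rows every time it is started, so 6 rows would appear if the job launched two map tasks instead of one, for example if `mapreduce.map.tasks` is not the property name the framework actually reads. The shell sketch below is only a local simulation of that effect. `mapper` and `simulate` are hypothetical names, and no Hadoop is involved.

```shell
#!/bin/sh
# Hypothetical stand-in for data_generator_hadoop_tarun.pl: like the Perl
# script, it never uses its input and always prints exactly 3 rows.
mapper() {
    cat > /dev/null              # discard whatever arrives on stdin
    for i in 0 1 2; do
        echo "$i:row"
    done
}

# Run the mapper once per simulated map task ($1 = number of tasks).
simulate() {
    t=0
    while [ "$t" -lt "$1" ]; do
        echo "input split $t" | mapper
        t=$((t + 1))
    done
}

simulate 1 | wc -l    # one task: 3 rows
simulate 2 | wc -l    # two tasks: 6 rows
```

If this is what is happening, the number of launched map tasks reported in the job counters would be 2 rather than 1, which would be the first thing to verify.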