I am running a Mahout 0.13.0 job on Hadoop 2.7.3 (on AWS). When I try to train on the generated vectors, it throws the following exception:
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
17/07/25 10:21:39 INFO Job: Task Id : attempt_1500972617227_0091_m_000008_2, Status : FAILED
Error: java.lang.IllegalArgumentException: Wrong numLabels: 0. Must be > 0!
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:44)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:796)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
My training input data has the following format (label, ID, then text) and contains no empty lines:
event 308398661275111424 Book your hotel deals Hotel Mercure Rainbow Spend a couple nights at Hotel Mercure Rainbow in #deals
health 308215054011194849 Get $8 Off $150 @ Muscle & Strength #coupons #deals
art 309512129285853184 Marvel Superhero Assorted Graphic Novel 15-Pack for $38 + $4 s&h: Graveyard Mall offers this ... #Offer360 #Deals
apparel 308215054011197980 Febreze Lavender Vanilla and Comfort Fabric Refresher, 27.0-Ounce (Pack of 9): Febreze Fabric Ref... #deal #deals
tech 309513762744979456 DataMan Next : Track Data Usage In Real-Time for iPhone on Sale ($1.99 -> $0.99) #iphone #deal
home 308215054011203842 #discounts #deals Offer 10: Nespresso Pixie Espresso Maker, Red coffee espresso reviews Best Buy Price
tech 308215054011206111 Lenovo DEALS - $9 Lenovo P830 Headset HOT #Lenovo #deals #coupons
health 308381655717003265 Therapy Systems Retinol Cellular Treatment Cleanser / PM: Containing pharmaceutical grade microen... #deal #deals
tech 308215054011204704 Tena Serenity Absorbency Pads, Slender 30 ea #amazon #deals
camera 308328440174624768 #Canon Powershot A2200 14.1 MP #Digital #Camera with 4x Optical Zoom #photo #deals
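To double-check the claim about the input format, here is a minimal Python sketch that parses each line into a label, an ID, and the text, and counts the documents per label. It assumes the fields are separated by whitespace (the sample does not show whether tabs or spaces are used), and the names parse_line and count_labels are made up for illustration. It only validates the raw file; it does not reproduce what Mahout does internally.

```python
from collections import Counter


def parse_line(line):
    """Split one input line into (label, doc_id, text).

    Assumption: fields are whitespace-separated, with the label first
    and a numeric ID second, as in the sample data.
    """
    parts = line.rstrip("\n").split(None, 2)
    if len(parts) < 3:
        raise ValueError(f"expected label, id and text, got: {line!r}")
    label, doc_id, text = parts
    if not doc_id.isdigit():
        raise ValueError(f"second field is not a numeric id: {doc_id!r}")
    return label, doc_id, text


def count_labels(lines):
    """Count documents per label, rejecting empty lines."""
    counts = Counter()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            raise ValueError(f"line {number} is empty")
        label, _, _ = parse_line(line)
        counts[label] += 1
    return counts


# Three lines copied from the sample data.
sample = [
    "tech 308215054011206111 Lenovo DEALS - $9 Lenovo P830 Headset HOT #Lenovo #deals #coupons",
    "health 308215054011194849 Get $8 Off $150 @ Muscle & Strength #coupons #deals",
    "tech 308215054011204704 Tena Serenity Absorbency Pads, Slender 30 ea #amazon #deals",
]
label_counts = count_labels(sample)
print(dict(label_counts))
```

For the real file, passing open("functionData.csv", encoding="utf-8") to count_labels would give the per-label counts for the whole dataset; an empty result or an exception would point to a formatting problem in the input itself.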
The commands I use to run the job:
1. mahout seqdirectory -i /opt/function/input/functionData.csv -o /opt/function/output/
2. mahout seq2sparse -i /opt/function/output/ -o /opt/function/vector/
3. mahout split -i /opt/function/vector/tfidf-vectors --trainingOutput /opt/function/train-vectors --testOutput /opt/function/test-vectors --randomSelectionPct 40 --overwrite --sequenceFiles -xm sequential
4. mahout trainnb -i /opt/function/vector/tfidf-vectors -o /opt/function/model -li /opt/function/labelindex -ow -c