Recently I have been learning MapReduce and using it to write data into a MySQL database. There are two ways to do this: DBOutputFormat and Sqoop. I tried the first one (following the example here), but ran into a problem. Below is the error:
...
16/05/25 09:36:53 INFO mapred.LocalJobRunner: 3 / 3 copied.
16/05/25 09:36:53 INFO mapred.LocalJobRunner: reduce task executor complete.
16/05/25 09:36:53 WARN output.FileOutputCommitter: Output Path is null in cleanupJob()
16/05/25 09:36:53 WARN mapred.LocalJobRunner: job_local1404930626_0001
java.lang.Exception: java.io.IOException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.io.IOException
at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat.getRecordWriter(DBOutputFormat.java:185)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/05/25 09:36:54 INFO mapreduce.Job: Job job_local1404930626_0001 failed with state FAILED due to: NA
16/05/25 09:36:54 INFO mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=32583
FILE: Number of bytes written=796446
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=402
HDFS: Number of bytes written=0
HDFS: Number of read operations=18
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
...
When I connect and insert the data manually via JDBC, it succeeds. I notice that the map/reduce task executor completes, but then an IOException is thrown, so I guess the problem is related to the database connection.
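For context, `DBOutputFormat.getRecordWriter()` wraps any failure to open the JDBC connection in a bare `IOException`, which matches the stack trace above: a wrong driver class, URL, username, or password (or the MySQL connector jar missing from the task classpath) would surface exactly like this. Below is a minimal sketch of how the job is typically configured; the URL, credentials, table, and column names are hypothetical placeholders, not my actual values:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;

public class WriteToMySQLJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // getRecordWriter() catches the underlying SQLException and rethrows
        // it as a plain IOException, so connection problems show up as the
        // opaque "Caused by: java.io.IOException" seen in the log above.
        DBConfiguration.configureDB(conf,
                "com.mysql.jdbc.Driver",                // driver jar must be on the task classpath
                "jdbc:mysql://localhost:3306/mydb",     // hypothetical JDBC URL
                "user", "password");                    // hypothetical credentials

        Job job = Job.getInstance(conf, "write to mysql");
        job.setOutputFormatClass(DBOutputFormat.class);
        // hypothetical table and column names for illustration
        DBOutputFormat.setOutput(job, "word_count", "word", "count");
        // ... set mapper, reducer, input path, and a DBWritable output key ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Since the exception message is swallowed, checking these four `configureDB` arguments (and that the connector jar ships with the job) is usually the first step.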
My code is here. I would appreciate it if someone could help me figure out what the problem is.
Thanks in advance!