Hadoop - cannot catch IOException when inserting into MySQL

Asked: 2014-01-08 08:17:24

Tags: java mysql hadoop mapreduce ioexception

I am using Hadoop 1.2.1 and running a map-only job that essentially maps log entries into a MySQL table. One of the extracted fields is an IP address, and sometimes it is longer than the column width in the table, which causes an IOException. Even though the map function has a try-catch clause, I cannot catch and handle it. The code is as follows:

public class LogEntriesMapper extends
        Mapper<Object, Text, LogEntry, NullDBWritable> {

    private static Pattern p1 = Pattern.compile([…]);

    private final static NullDBWritable nullDB = new NullDBWritable();
    private LogEntry logEntry = new LogEntry();

    @Override
    protected void setup(Context context) throws IOException,
            InterruptedException {
        super.setup(context);
        p1 = Pattern.compile([…]);
    }

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String entry = value.toString();

        Matcher matcher = p1.matcher(entry);

        if (matcher.find()) {

            String date = ...
            String ip = ...
            [extracting fields]

            logEntry.setDate(date);
            logEntry.setIp(ip);
            logEntry.setClient(client);
            logEntry.setSession(session);
            logEntry.setReal_time(real_time);

            try {
                context.write(logEntry, nullDB);
            } catch (IOException e) {
                System.out.println("Failed to save entry: " + logEntry);
                System.out.println(e.getMessage());
            }
        }
    }
}
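Note that DBOutputFormat's record writer does not execute the INSERT inside `context.write()`; it adds the row to a batched statement and only executes the batch when the task closes the writer. A minimal sketch of that deferred-write pattern (hypothetical `BufferedDbWriter`, not the actual Hadoop class) illustrates why the try/catch above never fires:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DBOutputFormat's DBRecordWriter: write() only
// buffers the row; the failure surfaces in close(), outside the mapper's
// try/catch.
public class BufferedDbWriter {
    private final int maxColumnWidth;
    private final List<String> buffer = new ArrayList<>();

    public BufferedDbWriter(int maxColumnWidth) {
        this.maxColumnWidth = maxColumnWidth;
    }

    // Analogue of context.write(): never throws, just queues the row.
    public void write(String ip) {
        buffer.add(ip);
    }

    // Analogue of DBRecordWriter.close(): the batch executes here, so a
    // too-long value raises IOException only now, after map() has returned.
    public void close() throws IOException {
        for (String ip : buffer) {
            if (ip.length() > maxColumnWidth) {
                throw new IOException(
                        "Data truncation: Data too long for column 'ip'");
            }
        }
        buffer.clear();
    }
}
```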

And the syslog:

2014-01-07 15:38:08,908 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

2014-01-07 15:38:09,233 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!

2014-01-07 15:38:09,330 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : null

2014-01-07 15:38:09,354 INFO org.apache.hadoop.mapred.MapTask: Processing split: hdfs://master:9000/logs/20130718.txt:0+49101

2014-01-07 15:38:09,672 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library

2014-01-07 15:38:09,675 INFO com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev fbd3aa777e0ad06bce75c6aff8c91c7c68eb596b]

2014-01-07 15:38:09,788 WARN org.apache.hadoop.mapreduce.lib.db.DBOutputFormat: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.Util.getInstance(Util.java:386)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1015)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
    at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1304)
    at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1296)
    at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:5028)
    at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:98)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
    at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1793)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

2014-01-07 15:38:09,789 INFO org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector@6dac133e
java.io.IOException: No operations allowed after statement closed.
    at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:103)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
    at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1793)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

2014-01-07 15:38:09,823 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1

2014-01-07 15:38:09,855 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mateuszmurawski cause:java.io.IOException: Data truncation: Data too long for column 'ip' at row 1

2014-01-07 15:38:09,855 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Data truncation: Data too long for column 'ip' at row 1
    at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:103)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

2014-01-07 15:38:09,861 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
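The stack traces point at `DBOutputFormat$DBRecordWriter.close()`, not at anything inside `map()`: the INSERT runs when the task closes the writer, after the mapper's try/catch has gone out of scope. One defensive workaround (a sketch; `MAX_IP_LENGTH` is a hypothetical constant that would have to match the real width of the `ip` column) is to validate or truncate the field in the mapper before writing, so the deferred batch can never fail with "Data too long for column":

```java
public class FieldValidator {

    // Hypothetical width of the `ip` column; adjust to the actual schema.
    static final int MAX_IP_LENGTH = 15;

    // Truncate a value so the deferred INSERT executed in
    // DBRecordWriter.close() cannot exceed the column width.
    static String fitToColumn(String value, int maxLength) {
        if (value == null || value.length() <= maxLength) {
            return value;
        }
        return value.substring(0, maxLength);
    }

    public static void main(String[] args) {
        System.out.println(fitToColumn("192.168.0.1", MAX_IP_LENGTH));
        System.out.println(fitToColumn(
                "2001:0db8:85a3:0000:0000:8a2e:0370:7334", MAX_IP_LENGTH));
    }
}
```

In the mapper this would be called as `logEntry.setIp(fitToColumn(ip, MAX_IP_LENGTH))`; alternatively, an over-long entry could be skipped (or sent to a counter) instead of truncated, depending on whether partial IPs are acceptable in the table.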

0 Answers:

There are no answers yet.