Facing an issue with sqoop-export

Asked: 2012-03-29 11:31:09

Tags: sql-server hadoop hive sqoop

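I have exported tables from Hive to SQL Server many times and have never faced this problem before.

I used "," as the field delimiter and created a matching table in SQL Server.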

hadoop@ubuntu:~/sqoop-1.3.0-cdh3u1/bin$ ./sqoop-export --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest' --table tmptempmeasurereport --export-dir /user/hive/warehouse/tmptempmeasurereport

12/03/29 16:20:21 INFO SqlServer.MSSQLServerManagerFactory: Using Microsoft's SQL Server - Hadoop Connector
12/03/29 16:20:21 INFO manager.SqlManager: Using default fetchSize of 1000
12/03/29 16:20:21 INFO tool.CodeGenTool: Beginning code generation
12/03/29 16:20:21 INFO manager.SqlManager: Executing SQL statement: SELECT TOP 1 * FROM [tmptempmeasurereport]
12/03/29 16:20:21 INFO manager.SqlManager: Executing SQL statement: SELECT TOP 1 * FROM [tmptempmeasurereport]
12/03/29 16:20:21 INFO orm.CompilationManager: HADOOP_HOME is /home/hadoop/hadoop-0.20.2-cdh3u2
12/03/29 16:20:22 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/1c5aae88cd7daca66aa665d4bab5b470/tmptempmeasurereport.jar
12/03/29 16:20:22 INFO mapreduce.ExportJobBase: Beginning export of tmptempmeasurereport
12/03/29 16:20:22 INFO manager.SqlManager: Executing SQL statement: SELECT TOP 1 * FROM [tmptempmeasurereport]
12/03/29 16:20:22 WARN mapreduce.ExportJobBase: IOException checking SequenceFile header: java.io.EOFException
12/03/29 16:20:23 INFO input.FileInputFormat: Total input paths to process : 2
12/03/29 16:20:23 INFO input.FileInputFormat: Total input paths to process : 2
12/03/29 16:20:23 INFO mapred.JobClient: Running job: job_201203291108_0645
12/03/29 16:20:24 INFO mapred.JobClient:  map 0% reduce 0%
12/03/29 16:20:29 INFO mapred.JobClient: Task Id : attempt_201203291108_0645_m_000000_0, Status : FAILED
java.util.NoSuchElementException
    at java.util.AbstractList$Itr.next(AbstractList.java:350)
    at tmptempmeasurereport.__loadFromFields(tmptempmeasurereport.java:383)
    at tmptempmeasurereport.parse(tmptempmeasurereport.java:332)
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79)
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)

12/03/29 16:20:34 INFO mapred.JobClient: Task Id : attempt_201203291108_0645_m_000000_1, Status : FAILED
java.util.NoSuchElementException
    at java.util.AbstractList$Itr.next(AbstractList.java:350)
    at tmptempmeasurereport.__loadFromFields(tmptempmeasurereport.java:383)
    at tmptempmeasurereport.parse(tmptempmeasurereport.java:332)
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79)
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)

12/03/29 16:20:38 INFO mapred.JobClient: Task Id : attempt_201203291108_0645_m_000000_2, Status : FAILED
java.util.NoSuchElementException
    at java.util.AbstractList$Itr.next(AbstractList.java:350)
    at tmptempmeasurereport.__loadFromFields(tmptempmeasurereport.java:383)
    at tmptempmeasurereport.parse(tmptempmeasurereport.java:332)
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79)
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)

12/03/29 16:20:43 INFO mapred.JobClient: Job complete: job_201203291108_0645
12/03/29 16:20:43 INFO mapred.JobClient: Counters: 7
12/03/29 16:20:43 INFO mapred.JobClient:   Job Counters
12/03/29 16:20:43 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=18742
12/03/29 16:20:43 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/03/29 16:20:43 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/03/29 16:20:43 INFO mapred.JobClient:     Launched map tasks=4
12/03/29 16:20:43 INFO mapred.JobClient:     Data-local map tasks=4
12/03/29 16:20:43 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
12/03/29 16:20:43 INFO mapred.JobClient:     Failed map tasks=1
12/03/29 16:20:43 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 21.0326 seconds (0 bytes/sec)
12/03/29 16:20:43 INFO mapreduce.ExportJobBase: Exported 0 records.
12/03/29 16:20:43 ERROR tool.ExportTool: Error during export: Export job failed!

[My versions are: hadoop-0.20.2-cdh3, sqoop-1.3.0-cdh3u1, Hive 0.7.1]

Am I doing something wrong? Please help me solve this problem.

Thanks a lot.

3 Answers:

Answer 0: (score: 3)

I suggest adding the --fields-terminated-by and --lines-terminated-by options to your sqoop command.
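
As a rough sketch of what that could look like with the command from the question: the connection string, table name and export directory are copied from the post, while the delimiter values (',' and '\n') and the SQOOP_EXPORT variable are assumptions made for illustration. Sqoop also documents --input-fields-terminated-by and --input-lines-terminated-by for describing the files an export reads, so it is worth checking the export help for your version to see which pair applies.

```shell
# Sketch only: re-run the export from the question with explicit delimiters.
# SQOOP_EXPORT is a hypothetical override for the path to the sqoop-export script.
export_with_delims() {
    "${SQOOP_EXPORT:-./sqoop-export}" \
        --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest' \
        --table tmptempmeasurereport \
        --export-dir /user/hive/warehouse/tmptempmeasurereport \
        --fields-terminated-by ',' \
        --lines-terminated-by '\n'
}
```

Calling export_with_delims from the sqoop bin directory runs the export; the delimiter values need to match what is actually in the files under the export directory.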

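Answer 1: (score: 2)

I get this error when the table I am exporting to has additional columns that are not present in the file. If you inspect the auto-generated tmptempmeasurereport.java, you will see the logic Sqoop is using.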
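
One way to check for this before exporting is to count the fields on each line and compare the result with the number of columns in the target table. The following is a minimal sketch, assuming comma-delimited text with no quoted or escaped commas; the function name find_short_records and the column count used in the usage note are hypothetical.

```shell
# Print every record whose field count differs from the expected column count.
# Usage: find_short_records EXPECTED_COLUMNS [FILE...]   (reads stdin if no file is given)
# The reported number is the record number across all input read.
find_short_records() {
    expected="$1"
    shift
    awk -F',' -v n="$expected" \
        'NF != n { printf "line %d: %d fields: %s\n", NR, NF, $0 }' "$@"
}
```

For a table with, say, three columns, something like `hadoop fs -cat /user/hive/warehouse/tmptempmeasurereport/* | find_short_records 3` would list the offending records. Empty lines are reported too, since they contain zero fields.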

Answer 2: (score: 2)

I fixed this error by removing the \n at the end of the last record in the text input file.

  • "1,this,42\n2,that,100\n" - fails
  • "1,this,42\n2,that,100" - works
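
To see how an input file actually ends, you can inspect its last bytes. This is a small sketch, assuming a local copy of the file (for example one fetched with hadoop fs -get); the function name describe_file_ending is hypothetical.

```shell
# Report how a file ends: "no-newline", "newline", or "blank-line".
# "blank-line" means the last two bytes are both newlines (0x0a 0x0a).
describe_file_ending() {
    file="$1"
    # Hex dump of the last two bytes with all whitespace removed, e.g. "300a".
    last_two="$(tail -c 2 "$file" | od -An -tx1 | tr -d ' \n')"
    case "$last_two" in
        *0a0a) echo "blank-line" ;;
        *0a)   echo "newline" ;;
        *)     echo "no-newline" ;;
    esac
}
```

A result of "blank-line" would mean the file contains an empty final record, which is the kind of input that has no fields at all for the generated parser to read.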