SQOOP - code too large > MAX table definition?

Time: 2013-11-07 14:04:04

Tags: hadoop bigdata teradata sqoop

I am trying to import data into HDFS from a TERADATA table with 2000 columns (the table definition runs to 90K characters)... When I run my script, I get:

/tmp/sqoop-hadoopi/compile/636c527afc3baa6fdf33464f02430602/table.java:21971: code too large

My sqoop script:

sqoop import \
 -libjars $LIB_JARS \
 --connect jdbc:teradata://PRD/Database=database \
 --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
 --table table \
 --username login \
 --password pass

My output log:

13/11/07 14:54:50 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/11/07 14:54:50 INFO manager.SqlManager: Using default fetchSize of 1000
13/11/07 14:54:50 INFO tool.CodeGenTool: Beginning code generation
13/11/07 14:55:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM table AS t WHERE 1=0
13/11/07 14:55:46 INFO orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop/libexec/..
13/11/07 14:55:46 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop/libexec/../hadoop-core.jar
/tmp/sqoop-hadoopi/compile/636c527afc3baa6fdf33464f02430602/table.java:21971: code too large
  public boolean equals(Object o) {
                 ^
/tmp/sqoop-hadoopi/compile/636c527afc3baa6fdf33464f02430602/table.java:37949: code too large
  public void write(DataOutput __dataOut) throws IOException {
              ^
/tmp/sqoop-hadoopi/compile/636c527afc3baa6fdf33464f02430602/table.java:49925: code too large
  public String toString(DelimiterSet delimiters, boolean useRecordDelim) {
                ^
/tmp/sqoop-hadoopi/compile/636c527afc3baa6fdf33464f02430602/table.java:53970: code too large
  private void __loadFromFields(List<String> fields) {
               ^
Note: /tmp/sqoop-hadoopi/compile/636c527afc3baa6fdf33464f02430602/table.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
4 errors
13/11/07 14:55:51 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Error returned by javac
        at org.apache.sqoop.orm.CompilationManager.compile(CompilationManager.java:205)
        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:83)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:390)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Maybe someone has already imported a table this large... Thanks a lot!

2 Answers:

Answer 0 (score: 1)

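Every method in Java is limited to 64KB of bytecode, and javac reports "code too large" when a method exceeds that limit. With 2000 columns, the generated equals, write, toString and __loadFromFields methods each touch every column and so go over it. I'm afraid the current version of Sqoop has no facility for splitting the long methods generated in your case into multiple sub-methods, so I'd suggest opening a new feature request on the Sqoop JIRA.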
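To make the "split into sub-methods" idea concrete, here is a minimal sketch of what a chunked equals could look like. It is not Sqoop's actual generated code: the class WideRecord, its four fields and the equalsChunk0/equalsChunk1 helpers are hypothetical names chosen for illustration, with four fields standing in for the 2000 columns.

```java
import java.util.Objects;

// Hypothetical stand-in for a generated record class; four fields
// represent what would be ~2000 columns in the real table.
class WideRecord {
    private final String col0;
    private final String col1;
    private final Integer col2;
    private final Integer col3;

    WideRecord(String col0, String col1, Integer col2, Integer col3) {
        this.col0 = col0;
        this.col1 = col1;
        this.col2 = col2;
        this.col3 = col3;
    }

    // equals() only dispatches to the chunks, so its own bytecode stays
    // small no matter how many columns the table has.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof WideRecord)) return false;
        WideRecord that = (WideRecord) o;
        return equalsChunk0(that) && equalsChunk1(that);
    }

    // Each helper compares a bounded slice of the columns, keeping every
    // individual method below the 64KB bytecode limit.
    private boolean equalsChunk0(WideRecord that) {
        return Objects.equals(col0, that.col0)
            && Objects.equals(col1, that.col1);
    }

    private boolean equalsChunk1(WideRecord that) {
        return Objects.equals(col2, that.col2)
            && Objects.equals(col3, that.col3);
    }

    // Kept consistent with equals(); a real wide class would chunk this too.
    @Override
    public int hashCode() {
        return Objects.hash(col0, col1, col2, col3);
    }

    public static void main(String[] args) {
        WideRecord a = new WideRecord("x", "y", 1, 2);
        WideRecord b = new WideRecord("x", "y", 1, 2);
        WideRecord c = new WideRecord("x", "y", 1, 3);
        System.out.println(a.equals(b)); // true
        System.out.println(a.equals(c)); // false
    }
}
```

The same pattern would apply to write, toString and __loadFromFields: the public method calls a series of private helpers, each handling a fixed-size group of columns.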

Answer 1 (score: 0)

I don't know whether you have already tried this, but there is a Teradata Connector for Hadoop:

http://developer.teradata.com/connectivity/articles/teradata-connector-for-hadoop-now-available