Got exception running Sqoop: java.lang.NullPointerException using --query and --as-parquetfile

Date: 2015-06-25 18:15:23

Tags: hadoop sqoop parquet

I am trying to import a table from Redshift into HDFS in Parquet format, and I get the error shown below:

15/06/25 11:05:42 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:97)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

Command line used:

sqoop import --driver "com.amazon.redshift.jdbc41.Driver" --connect "jdbc:postgresql://:5439/events" --username "username" --password "password" --query "SELECT * FROM mobile_og.pages WHERE \$CONDITIONS" --split-by anonymous_id --target-dir /user/huser/pq_mobile_og_pages_2 --as-parquetfile

The import works fine when the --as-parquetfile option is removed from the command above.

1 answer:

Answer 0 (score: 2)

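This is confirmed as a bug: SQOOP-2571.

If you want to import all of the table's data, you can run the following command instead, which uses --table rather than --query: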

sqoop import --driver "com.amazon.redshift.jdbc41.Driver" \
  --connect "jdbc:postgresql://:5439/events" \
  --username "username" --password "password" \
  --table mobile_og.pages \
  --split-by anonymous_id \
  --target-dir /user/huser/pq_mobile_og_pages_2 \
  --as-parquetfile

--where is also a useful argument if you only need some of the rows. Check the user manual.
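As a rough sketch of how the --table workaround might be combined with --where, the script below builds the same import with a row filter and prints it for review before anything runs. The filter itself is an assumption for illustration (anonymous_id is the only column named in the question), and RUN_SQOOP is a hypothetical switch introduced here, not a Sqoop setting. Whether this combination avoids the NullPointerException on a given Sqoop version is not verified here.

```shell
# Assumed example filter; replace it with a condition that fits your table.
WHERE_CLAUSE="anonymous_id IS NOT NULL"

# Collect the arguments as positional parameters so the quoted filter
# is passed to Sqoop as a single argument.
set -- import \
  --driver "com.amazon.redshift.jdbc41.Driver" \
  --connect "jdbc:postgresql://:5439/events" \
  --username "username" --password "password" \
  --table mobile_og.pages \
  --where "$WHERE_CLAUSE" \
  --split-by anonymous_id \
  --target-dir /user/huser/pq_mobile_og_pages_2 \
  --as-parquetfile

if [ "${RUN_SQOOP:-0}" = "1" ]; then
  # Hypothetical opt-in switch: actually run the import.
  sqoop "$@"
else
  # Dry run: print the command for review.
  echo "sqoop $*"
fi
```

Running the script as-is only prints the command; setting RUN_SQOOP=1 in the environment executes it. Because the filter is passed through --where rather than --query, the command stays on the --table code path that the workaround above relies on.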