How do I specify the columns for a Sqoop export into the target database?

Time: 2015-03-13 12:58:01

Tags: postgresql-9.3 avro sqoop2

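I want to use a Sqoop (2) export to populate a Postgres table from Avro files. The source data has no id field; the column should be filled in automatically (it is of type serial), but I get an error.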

Table DDL:

CREATE TABLE test
 (
 id serial primary key,
 partner_id varchar,
 column1 varchar,
 column2 varchar
)

Avro schema:

{
"namespace": "avro_test",
"type": "record",
"name": "test",
"fields": [
      {"name": "partner_id", "type": "string"},
      {"name": "column1", "type": ["string","null"]},
      {"name": "column2", "type": ["string","null"]}
      ]
}
The export command I use:

./sqoop-1.4.5.bin__hadoop-2.0.4-alpha/bin/sqoop export \
    --connect jdbc:postgresql://host/db \
    --username user_test --password pass_test \
    --table test \
    --export-dir path \
    --columns partner_id,column1,column2

However, I get an error because the Avro schema has no id field:

Status : FAILED
Error: java.io.IOException: Cannot find field id in Avro schema

I tried specifying the target columns with the --columns parameter, but it does not work. How can I load the Avro file above?

If I drop the id column from the table, the export succeeds.

Thanks in advance.

1 Answer:

Answer 0: (score: 0)

A simple solution is to add an id field to your Avro schema; it will be null by default:

{
  "namespace": "avro_test",
  "type": "record",
  "name": "test",
  "fields": [
        {"name": "id", "type": ["null", "int"]},
        {"name": "partner_id", "type": "string"},
        {"name": "column1", "type": ["string","null"]},
        {"name": "column2", "type": ["string","null"]}
  ]

}

When exporting to MySQL through Sqoop, the primary key "Id" is then populated automatically. Hope this helps!
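If the schema is kept in a file, the id field could be added programmatically rather than by hand. The following is only a minimal sketch using the Python standard library: the helper name add_nullable_id is hypothetical, and the explicit "default": null entry is an addition that the schema above does not contain (the Avro specification allows a null default when "null" is the first branch of the union). It also assumes the Avro data files are afterwards rewritten with the patched schema, since a data file carries the schema it was written with.

```python
import json

# Schema from the question, without an id field.
ORIGINAL_SCHEMA = """
{
  "namespace": "avro_test",
  "type": "record",
  "name": "test",
  "fields": [
    {"name": "partner_id", "type": "string"},
    {"name": "column1", "type": ["string", "null"]},
    {"name": "column2", "type": ["string", "null"]}
  ]
}
"""


def add_nullable_id(schema, field_name="id"):
    """Return a copy of the record schema with a nullable int field prepended.

    Hypothetical helper for illustration; the schema is returned unchanged
    if a field with that name already exists.
    """
    fields = list(schema["fields"])
    if any(f["name"] == field_name for f in fields):
        return schema
    # "null" comes first in the union so that a null default is valid.
    id_field = {"name": field_name, "type": ["null", "int"], "default": None}
    patched = dict(schema)
    patched["fields"] = [id_field] + fields
    return patched


patched = add_nullable_id(json.loads(ORIGINAL_SCHEMA))
print(json.dumps(patched, indent=2))
```

The printed schema can be saved and used when regenerating the Avro files; whether the target database then fills in the serial column for null ids should be checked against a test table first.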