Sqoop Hive import with partitions

Date: 2017-02-22 18:25:03

Tags: hadoop hive sqoop parquet partition

I have some Sqoop jobs importing into Hive tables that I want partitioned, but I can't get the partitioning to work. The import itself succeeds: the table is sqooped, it's visible in Hive, and it has data, but when I describe the table, the partition parameters I expect to see aren't there. I have already sqooped this table as CSV, created an external Parquet table, and inserted the data into it (which works), but I'd like to avoid those extra steps if possible (a sketch of that workaround follows the command below). Here is my current code. Am I missing something, or am I trying to do the impossible? Thanks!

sqoop import -Doraoop.import.hint=" " \
--options-file /home/[user]/pass.txt \
--verbose \
--connect jdbc:oracle:thin:@ldap://oid:389/cn=OracleContext,dc=[employer],dc=com/SQSOP051 \
--username [user] \
--num-mappers 10 \
--hive-import \
--query "select DISC_PROF_SK_ID, CLM_RT_DISC_IND, EASY_PAY_PLN_DISC_IND, TO_CHAR(L40_ATOMIC_TS,'YYYY') as YEAR, TO_CHAR(L40_ATOMIC_TS,'MM') as MONTH from ${DataSource[index]}.$TableName where \$CONDITIONS" \
--hive-database [dru_user] \
--hcatalog-partition-keys YEAR \
--hcatalog-partition-values '2015' \
--target-dir hdfs://nameservice1/data/res/warehouse/finance/[dru_user]/Claims_Data/$TableName \
--hive-table $TableName'testing' \
--split-by ${SplitBy[index]} \
--delete-target-dir \
--direct \
--null-string '\\N' \
--null-non-string '\\N' \
--as-parquetfile
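For reference, the multi-step workaround mentioned above looks roughly like this from the Hive shell; the table names claims_csv and claims_parquet and the column types are illustrative assumptions, not from the original post:

-- External Parquet table partitioned by year (table/column types assumed)
CREATE EXTERNAL TABLE [dru_user].claims_parquet (
  disc_prof_sk_id        BIGINT,
  clm_rt_disc_ind        STRING,
  easy_pay_pln_disc_ind  STRING,
  month                  STRING
)
PARTITIONED BY (year STRING)
STORED AS PARQUET
LOCATION 'hdfs://nameservice1/data/res/warehouse/finance/[dru_user]/claims_parquet';

-- Copy the CSV-sqooped data in, letting Hive derive the partition value;
-- the partition column must come last in the SELECT list
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE [dru_user].claims_parquet PARTITION (year)
SELECT disc_prof_sk_id, clm_rt_disc_ind, easy_pay_pln_disc_ind, month, year
FROM [dru_user].claims_csv;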

1 Answer:

Answer 0 (score: 0)

You can replace the options file with --password-file; that alone will not solve the partitioning issue, though. For the partition problem, try creating the partitioned table $TableName before running the import.
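A minimal sketch of that pre-created table, with column types assumed from the query below (YEAR is the only key passed to --hcatalog-partition-keys, so MONTH stays a regular column):

CREATE TABLE [dru_user].$TableName (
  disc_prof_sk_id        BIGINT,
  clm_rt_disc_ind        STRING,
  easy_pay_pln_disc_ind  STRING,
  month                  STRING
)
PARTITIONED BY (year STRING)
STORED AS PARQUET;

With the table in place, the import can go through HCatalog instead of a plain Hive import: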

# Note: HCatalog imports do not support --hive-import, --target-dir,
# --direct, or --as-parquetfile; the storage format comes from the
# pre-created Hive table, so those options are omitted here.
sqoop import -Doraoop.import.hint=" "               \
  --password-file /home/[user]/pass.txt             \
  --verbose                                         \
  --connect jdbc:oracle:thin:@ldap://oid:389/cn=OracleContext,dc=[employer],dc=com/SQSOP051 \
  --username [user]                                 \
  --num-mappers 10                                  \
  --query "SELECT disc_prof_sk_id,
       clm_rt_disc_ind,
       easy_pay_pln_disc_ind,
       To_char(l40_atomic_ts,'YYYY') AS year,
       To_char(l40_atomic_ts,'MM')   AS month
    FROM   ${DataSource[index]}.$TableName
    WHERE  \$CONDITIONS"                            \
  --hcatalog-database [dru_user]                    \
  --hcatalog-table $TableName                       \
  --hcatalog-partition-keys   YEAR                  \
  --hcatalog-partition-values '2015'                \
  --split-by ${SplitBy[index]}                      \
  --null-string '\\N'                               \
  --null-non-string '\\N'
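
Once the job finishes, the partition should be visible in the Hive metastore, which you can verify from the Hive shell:

SHOW PARTITIONS [dru_user].$TableName;
-- expected to list something like: year=2015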