Hive: INSERT OVERWRITE from one table into another table with a different number of columns

Date: 2016-04-18 20:43:54

Tags: sql hadoop hive hiveql

I have two Hive tables.

The source table contains the following columns:

| correspondence_id       | decimal(22,0)  |          |
| template_id             | decimal(18,0)  |          |
| language_cd             | varchar(6)     |          |
| delivery_channel_cd     | varchar(20)    |          |
| job_id                  | decimal(18,0)  |          |
| correspondence_content  | string         |          |
| create_user_id          | varchar(40)    |          |
| create_ts               | timestamp      |          |
| last_updt_user_id       | varchar(40)    |          |
| last_updt_ts            | timestamp      |          |
| data_src_id             | decimal(18,0)  |          |
| src_app_resource_cd     | varchar(50)    |          |

The destination table contains the following columns:

| correspondence_id        | decimal(22,0)         |                       |
| template_id              | decimal(18,0)         |                       |
| template_cd              | varchar(20)           |                       |
| template_type_cd         | varchar(40)           |                       |
| category_cd              | varchar(20)           |                       |
| language_cd              | varchar(6)            |                       |
| delivery_channel_cd      | varchar(20)           |                       |
| job_id                   | decimal(18,0)         |                       |
| correspondence_content   | string                |                       |
| create_user_id           | varchar(40)           |                       |
| create_ts                | timestamp             |                       |
| last_updt_user_id        | varchar(40)           |                       |
| last_updt_ts             | timestamp             |                       |
| data_src_id              | decimal(18,0)         |                       |
| src_app_resource_cd      | varchar(50)           |                       |
| part_create_year_num     | int                   |                       |
| part_create_month_num    | int                   |                       |
|                          | NULL                  | NULL                  |
| # Partition Information  | NULL                  | NULL                  |
| # col_name               | data_type             | comment               |
|                          | NULL                  | NULL                  |
| part_create_year_num     | int                   |                       |
| part_create_month_num    | int                   |                       |

I use the following query to transfer the data:

FROM source_table cc insert overwrite table 
destination_table partition 
(part_create_year_num=2016, part_create_month_num=9 )
select cc.correspondence_id, cc.template_id, cc.language_cd, cc.delivery_channel_cd, cc.job_id, 
cc.correspondence_content, cc.create_user_id, cc.create_ts, cc.last_updt_user_id, cc.last_updt_ts, 
cc.data_src_id, cc.src_app_resource_cd

But when I run this query, I get the following error:

Error: Error while compiling statement: FAILED: SemanticException [Error 10044]: Line 1:79 Cannot insert into target table because column number/types are different '9': Table insclause-0 has 15 columns, but query has 12 columns. (state=42000,code=10044)
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10044]: Line 1:79 Cannot insert into target table because column number/types are different '9': Table insclause-0 has 15 columns, but query has 12 columns.

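The source and destination tables clearly differ, but how can I make this query work? I tried supplying placeholder values, but that did not work either.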

1 answer:

Answer 0: (score: 1)

The destination table appears to have five extra columns.

Regular columns:
1. template_cd | varchar(20)
2. template_type_cd | varchar(40)
3. category_cd | varchar(20)

Partition columns:
4. part_create_year_num | int
5. part_create_month_num | int
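The counts in the error message can be reconciled against the table definition. The sketch below uses only the table and column names from the question; the arithmetic in the comments is read off the DESCRIBE output shown above, not from any additional source.

```sql
-- Inspect the destination schema. The partition columns appear twice:
-- once in the main column list and again under "# Partition Information".
DESCRIBE destination_table;

-- 17 columns are listed in total.
--   2 of them are partition columns, whose values are given statically
--     in the PARTITION (...) clause of the insert.
--  15 remain, and the SELECT list must supply all of them,
--     in the same order as they appear in the table.
-- The original SELECT supplies only 12, so 3 are missing:
--   template_cd, template_type_cd, category_cd
```

Hive matches the SELECT list to the destination columns by position rather than by name, so the three missing columns have to be filled in at their positions (after template_id), not appended at the end.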

The query should be:

INSERT OVERWRITE TABLE destination_table
PARTITION (part_create_year_num=2016, part_create_month_num=9)
SELECT
  correspondence_id,
  template_id,
  '' AS template_cd,        -- placeholder: not present in source_table
  '' AS template_type_cd,   -- placeholder: not present in source_table
  '' AS category_cd,        -- placeholder: not present in source_table
  language_cd,
  delivery_channel_cd,
  job_id,
  correspondence_content,
  create_user_id,
  create_ts,
  last_updt_user_id,
  last_updt_ts,
  data_src_id,
  src_app_resource_cd
FROM source_table;
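If an empty string is not a meaningful value for the three missing columns, a typed NULL may be a better placeholder. The variant below is a sketch under that assumption (that NULL is acceptable in template_cd, template_type_cd, and category_cd); it is not part of the original answer. The casts use the column types from the destination table definition in the question.

```sql
-- Same insert, but with typed NULLs instead of empty strings
-- for the columns that have no counterpart in source_table.
INSERT OVERWRITE TABLE destination_table
PARTITION (part_create_year_num = 2016, part_create_month_num = 9)
SELECT
  correspondence_id,
  template_id,
  CAST(NULL AS varchar(20)) AS template_cd,
  CAST(NULL AS varchar(40)) AS template_type_cd,
  CAST(NULL AS varchar(20)) AS category_cd,
  language_cd,
  delivery_channel_cd,
  job_id,
  correspondence_content,
  create_user_id,
  create_ts,
  last_updt_user_id,
  last_updt_ts,
  data_src_id,
  src_app_resource_cd
FROM source_table;

-- Quick check that the partition was populated.
SELECT COUNT(*)
FROM destination_table
WHERE part_create_year_num = 2016
  AND part_create_month_num = 9;
```

Either way, the SELECT list ends up with 15 expressions in the destination table's column order, which is what the compiler checks for.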