Sqoop to Hive: importing into a partition

1. Create a table in MySQL with four fields (id, name, age, sex)

CREATE TABLE `mon2`
(`id` int, `name` varchar(43), `age` int, `sex` varchar(334));

2. Use a CSV file, abc.csv

Insert the following data into the MySQL table:

1,mahesh,23,m
2,ramesh,32,m
3,prerna,43,f
4,jitu,23,m
5,sandip,32,m
6,gps,43,f

mysql> LOAD DATA LOCAL INFILE 'location_of_your_csv/abc.csv'
    -> INTO TABLE mon2
    -> FIELDS TERMINATED BY ','
    -> LINES TERMINATED BY '\n';
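
Before loading, the sample rows can be written to abc.csv and sanity-checked from the shell. This is only a minimal sketch: the temporary path and the helper names `count_bad_rows` and `count_rows_for_sex` are assumptions made for illustration, not part of the original answer.

```shell
#!/bin/sh
# Write the six sample rows to a CSV file (the path is an assumption).
csv="${TMPDIR:-/tmp}/abc.csv"
cat > "$csv" <<'EOF'
1,mahesh,23,m
2,ramesh,32,m
3,prerna,43,f
4,jitu,23,m
5,sandip,32,m
6,gps,43,f
EOF

# Hypothetical helper: count rows that do not have exactly 4 comma-separated fields.
count_bad_rows() {
  awk -F',' 'NF != 4 { bad++ } END { print bad + 0 }' "$1"
}

# Hypothetical helper: count rows whose 4th field (sex) equals the given value.
count_rows_for_sex() {
  awk -F',' -v s="$2" '$4 == s { n++ } END { print n + 0 }' "$1"
}

echo "malformed rows: $(count_bad_rows "$csv")"
echo "rows with sex=m: $(count_rows_for_sex "$csv" m)"
```

The second count is the number of rows the import in the next step should bring into the sex='m' partition.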

3. Now start your Hadoop services, go to $SQOOP_HOME, and write the sqoop import query for the partitioned Hive import.

sqoop import \
--connect jdbc:mysql://localhost:3306/apr \
--username root \
--password root \
-e "select id, name, age from mon2 where sex='m' and \$CONDITIONS" \
--target-dir /user/hive/warehouse/hive_part \
--split-by id \
--hive-overwrite \
--hive-import \
--create-hive-table \
--hive-partition-key sex \
--hive-partition-value 'm' \
--fields-terminated-by ',' \
--hive-table mar.hive_part \
--direct
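
The command above loads a single partition (sex='m'). To load more than one value, the same command can be generated once per value. The sketch below only prints the commands so they can be reviewed before running; the helper name `print_import_cmd` and the per-value staging directory are assumptions, and `--create-hive-table` is passed only for the first value because Sqoop fails the job if the Hive table already exists.

```shell
#!/bin/sh
# Hypothetical helper: print (without running) the import command for one
# value of the sex column. Pass "create" as $2 only for the first value,
# because --create-hive-table fails when the Hive table already exists.
print_import_cmd() {
  value=$1
  create_flag=""
  if [ "$2" = "create" ]; then
    create_flag="--create-hive-table"
  fi
  # Assumption: one staging directory per value so repeated runs do not collide.
  cat <<EOF
sqoop import \\
--connect jdbc:mysql://localhost:3306/apr \\
--username root \\
--password root \\
-e "select id, name, age from mon2 where sex='${value}' and \\\$CONDITIONS" \\
--target-dir /user/hive/warehouse/hive_part_${value} \\
--split-by id \\
--hive-overwrite \\
--hive-import ${create_flag} \\
--hive-partition-key sex \\
--hive-partition-value '${value}' \\
--fields-terminated-by ',' \\
--hive-table mar.hive_part \\
--direct
EOF
}

# Print one command per partition value found in the sample data.
first=create
for value in m f; do
  print_import_cmd "$value" "$first"
  echo
  first=""
done
```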

Hive to MySQL: exporting a partition with Sqoop

1. Create a hive_temp table to load the data into

create table hive_temp
(id int, name string, age int, gender string)
row format delimited fields terminated by ',';

2. Load the data

load data local inpath '/home/zicone/Documents/pig_to_hbase/stack.csv' into table hive_temp;

3. Create a partitioned table, partitioned by the specific column you want to partition on.

create table hive_part1
(id int, name string, age int)
partitioned by (gender string)
row format delimited fields terminated by ',';

4. Add a partition to the hive_part1 table

alter table hive_part1 add partition(gender='m');

5. Copy the data from hive_temp into the hive_part1 table

insert overwrite table hive_part1 partition(gender='m')
select id, name, age from hive_temp where gender='m';
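
Steps 4 and 5 handle one partition value at a time. If the source file contains several gender values, the two statements can be generated per value and collected into one script. This is a sketch under assumptions: the helper name `print_partition_hql` and the script path are made up for illustration, the values m and f are taken from the sample data, and `if not exists` is added so the script can be rerun.

```shell
#!/bin/sh
# Hypothetical helper: print the HiveQL from steps 4 and 5 for one gender value.
print_partition_hql() {
  value=$1
  cat <<EOF
alter table hive_part1 add if not exists partition(gender='${value}');
insert overwrite table hive_part1 partition(gender='${value}')
select id, name, age from hive_temp where gender='${value}';
EOF
}

# Collect the statements for every value into one script (path is an assumption).
hql_file="${TMPDIR:-/tmp}/load_partitions.hql"
: > "$hql_file"
for value in m f; do
  print_partition_hql "$value" >> "$hql_file"
done
cat "$hql_file"
# The script could then be run with: hive -f "$hql_file"
```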

6. The sqoop export command

Create a table in MySQL

mysql> create table mon3 like mon2;

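The files under a Hive partition directory hold only id, name, and age (the partition column gender is kept in the directory name, not in the data), so list the target columns explicitly; sex in mon3 is left NULL.

sqoop export \
--connect jdbc:mysql://localhost:3306/apr \
--table mon3 \
--columns "id,name,age" \
--export-dir /user/hive/warehouse/mar.db/hive_part1/gender=m \
-m 1 \
--username root \
--password root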
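
The export directory follows Hive's layout of one subdirectory per partition, named column=value. A small sketch of building that path and printing the matching export command for each partition follows; the helper name `partition_dir` is hypothetical, and `--columns "id,name,age"` is included on the assumption that the partition files contain only those three fields.

```shell
#!/bin/sh
# Hypothetical helper: build the HDFS directory of one partition.
# $1 = table directory in the warehouse, $2 = partition column, $3 = value.
partition_dir() {
  printf '%s/%s=%s\n' "$1" "$2" "$3"
}

table_dir=/user/hive/warehouse/mar.db/hive_part1
for value in m f; do
  dir=$(partition_dir "$table_dir" gender "$value")
  # Print the command instead of running it, so it can be reviewed first.
  printf '%s\n' "sqoop export --connect jdbc:mysql://localhost:3306/apr --table mon3 --columns \"id,name,age\" --export-dir $dir -m 1 --username root --password root"
done
```

Because the partition value lives only in the directory name, rows exported this way arrive in mon3 without a sex value.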

Now go to the MySQL terminal and run

select * from mon3;

Hope this works for you :)

Hadoop - Sqoop export/import of a partitioned table
