静态分区输出在配置单元

时间:2016-12-12 07:59:20

标签: hadoop hive

我正在使用静态分区 处理数据如下

Id Name Salary Dept Doj
1,Murtaza,360000,Sales,2010
2,Soumya,478968,Admin,2011
3,Sneha,45789, Dev,2012
4,Asif ,145687, Qa,2012
5,Shreyashi,36598,Qa,2011
6,Adil,25987,Dev,2010
7,Yashwant,23982,Admin,2011
8,Mohsin,569875,2012
9,Anil,56798,Sales,2010
10,Balaji,56489,Sales,2012
11,Utsav,563895,Qa,2010
12,Anuj,546987,Dev,2010

Hql用于创建Partitionng表并将数据加载到其中如下

create external table if not exists murtaza.PartSalaryReport (ID int,Name
string,Salary string,Dept string)
partitioned by (Doj string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
stored as textfile
location '/user/cts573151/externaltables';

LOAD DATA  LOCAL INPATH '/home/cts573151/partition.txt'
overwrite into table murtaza.PartSalaryReport partition (Doj=2010);
LOAD DATA  LOCAL INPATH '/home/cts573151/partition.txt'
overwrite into table murtaza.PartSalaryReport partition (Doj=2011);
LOAD DATA  LOCAL INPATH '/home/cts573151/partition.txt'
overwrite into table murtaza.PartSalaryReport partition (Doj=2012);

Select * from murtaza.PartSalaryReport;`

现在问题是在我的hdfs位置,在外部表所在的位置,我应该明智地获取数据目录,直到它确定

`

[cts573151@aster2 ~]$ hadoop dfs -ls /user/cts573151/externaltables`
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Found 4 items
drwxr-xr-x   - cts573151 supergroup          0 2016-12-12 13:06    /user/cts573151/externaltables/doj=2010
drwxr-xr-x   - cts573151 supergroup          0 2016-12-12 13:06  /user/cts573151/externaltables/doj=2011
drwxr-xr-x   - cts573151 supergroup          0 2016-12-12 13:06 /user/cts573151/externaltables/doj=2012

但是当我查看里面的数据时     drwxr-xr-x - cts573151 supergroup 0 2016-12-12 13:06 / user / cts573151 / externaltables / doj = 2010

它显示了2010,2011和2012年的所有数据,但它应该只显示2010年的数据

[cts573151@aster2 ~]$ hadoop dfs -ls /user/cts573151/externaltables/doj=2010
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Found 1 items
-rwxr-xr-x   3 cts573151 supergroup        270 2016-12-12 13:06         /user/cts573151/externaltables/doj=2010/partition.txt
[cts573151@aster2 ~]$ hadoop dfs -cat   /user/cts573151/externaltables/doj=2010/partition.txt
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

1,Murtaza,360000,Sales,2010
2,Soumya,478968,Admin,2011
3,Sneha,45789,Dev,2012
4,Asif,145687,Qa,2012
5,Shreyashi,36598,Qa,2011
6,Adil,25987,Dev,2010
7,Yashwant,23982,Qa,2011
9,Anil,56798,Sales,2010
10,Balaji,56489,Sales,2012
11,Utsav,53895,Qa,2010
12,Anuj,54987,Dev,2010
[cts573151@aster2 ~]$

哪里错了???

1 个答案:

答案 0 :(得分:0)

由于您在配置单元中创建外部表,因此您必须遵循以下命令集:

create external table if not exists murtaza.PartSalaryReport (
ID int, Name string, Salary string, Dept string)
partitioned by (Doj string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
stored as textfile
location '/user/cts573151/externaltables';

alter table murtaza.PartSalaryReport add partition (Doj=2010);
hdfs dfs -put /home/cts573151/partition1.txt /user/cts573151/externaltables/Doj=2010/

alter table murtaza.PartSalaryReport add partition (Doj=2011);
hdfs dfs -put /home/cts573151/partition2.txt /user/cts573151/externaltables/Doj=2011/

alter table murtaza.PartSalaryReport add partition (Doj=2012);
hdfs dfs -put /home/cts573151/partition3.txt /user/cts573151/externaltables/Doj=2012/

这些命令对我有用,希望它可以帮助你!!!