我正在使用静态分区 处理数据如下
Id Name Salary Dept Doj
1,Murtaza,360000,Sales,2010
2,Soumya,478968,Admin,2011
3,Sneha,45789, Dev,2012
4,Asif ,145687, Qa,2012
5,Shreyashi,36598,Qa,2011
6,Adil,25987,Dev,2010
7,Yashwant,23982,Admin,2011
8,Mohsin,569875,2012
9,Anil,56798,Sales,2010
10,Balaji,56489,Sales,2012
11,Utsav,563895,Qa,2010
12,Anuj,546987,Dev,2010
create external table if not exists murtaza.PartSalaryReport (ID int,Name
string,Salary string,Dept string)
partitioned by (Doj string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
stored as textfile
location '/user/cts573151/externaltables';
LOAD DATA LOCAL INPATH '/home/cts573151/partition.txt'
overwrite into table murtaza.PartSalaryReport partition (Doj=2010);
LOAD DATA LOCAL INPATH '/home/cts573151/partition.txt'
overwrite into table murtaza.PartSalaryReport partition (Doj=2011);
LOAD DATA LOCAL INPATH '/home/cts573151/partition.txt'
overwrite into table murtaza.PartSalaryReport partition (Doj=2012);
Select * from murtaza.PartSalaryReport;`
现在问题是在我的hdfs位置,在外部表所在的位置,我应该明智地获取数据目录,直到它确定
`
[cts573151@aster2 ~]$ hadoop dfs -ls /user/cts573151/externaltables`
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Found 4 items
drwxr-xr-x - cts573151 supergroup 0 2016-12-12 13:06 /user/cts573151/externaltables/doj=2010
drwxr-xr-x - cts573151 supergroup 0 2016-12-12 13:06 /user/cts573151/externaltables/doj=2011
drwxr-xr-x - cts573151 supergroup 0 2016-12-12 13:06 /user/cts573151/externaltables/doj=2012
但是当我查看里面的数据时 drwxr-xr-x - cts573151 supergroup 0 2016-12-12 13:06 / user / cts573151 / externaltables / doj = 2010
它显示了2010,2011和2012年的所有数据,但它应该只显示2010年的数据
[cts573151@aster2 ~]$ hadoop dfs -ls /user/cts573151/externaltables/doj=2010
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Found 1 items
-rwxr-xr-x 3 cts573151 supergroup 270 2016-12-12 13:06 /user/cts573151/externaltables/doj=2010/partition.txt
[cts573151@aster2 ~]$ hadoop dfs -cat /user/cts573151/externaltables/doj=2010/partition.txt
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
1,Murtaza,360000,Sales,2010
2,Soumya,478968,Admin,2011
3,Sneha,45789,Dev,2012
4,Asif,145687,Qa,2012
5,Shreyashi,36598,Qa,2011
6,Adil,25987,Dev,2010
7,Yashwant,23982,Qa,2011
9,Anil,56798,Sales,2010
10,Balaji,56489,Sales,2012
11,Utsav,53895,Qa,2010
12,Anuj,54987,Dev,2010
[cts573151@aster2 ~]$
哪里错了???
答案 0 :(得分:0)
由于您在配置单元中创建外部表,因此您必须遵循以下命令集:
create external table if not exists murtaza.PartSalaryReport (
ID int, Name string, Salary string, Dept string)
partitioned by (Doj string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
stored as textfile
location '/user/cts573151/externaltables';
alter table murtaza.PartSalaryReport add partition (Doj=2010);
hdfs dfs -put /home/cts573151/partition1.txt /user/cts573151/externaltables/Doj=2010/
alter table murtaza.PartSalaryReport add partition (Doj=2011);
hdfs dfs -put /home/cts573151/partition2.txt /user/cts573151/externaltables/Doj=2011/
alter table murtaza.PartSalaryReport add partition (Doj=2012);
hdfs dfs -put /home/cts573151/partition3.txt /user/cts573151/externaltables/Doj=2012/
这些命令对我有用,希望它可以帮助你!!!