Why is my Hive table empty when I create it from HDFS data?

Time: 2019-01-24 10:53:52

Tags: hive

I want to create a Hive table that reads an HDFS file directly, without specifying column names.

So I created the table like this:

CREATE EXTERNAL TABLE test10 (rec string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
ROW FORMAT DELIMITED;
LOCATION '/hdfs/data/adhoc/Prepublication-PUB_1EEVC-20171201/InterfacePublique-Controle-PUB_1EEVC-201711-PR-20181228-152828-indicateurs-PUB_1EEVC/*';

The table is created, but when I query it, the result is empty:

hive (indicateurs1)> select * from test10;
OK
Time taken: 0.471 seconds

For reference:

hadoop fs -cat /hdfs/data/adhoc/Prepublication-PUB_1EEVC-20171201/InterfacePublique-Controle-PUB_1EEVC-201711-PR-20181228-152828-indicateurs-PUB_1EEVC/*

gives

DIS_CD_EFS_PSE,01,,280237,68.12
DIS_CD_EFS_PSE,02,,18621,4.53
DIS_CD_EFS_PSE,03,,76818,18.67
DIS_CD_EFS_PSE,06,,781,0.19
DIS_CD_EFS_PSE,07,,296,0.07
DIS_CD_EFS_PSE,08,,238,0.06
DIS_CD_EFS_PSE,13,,8968,2.18

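Why is the table empty, and how can I get at the data?

2 answers:

Answer 0: (score: 0)

LOCATION should be the folder name, without the trailing /*. Also specify fields terminated by '\n' (the same as the line terminator), so that each whole line lands in the single rec column: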

CREATE EXTERNAL TABLE test10 (rec string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\n'
LINES TERMINATED BY '\n'
LOCATION '/hdfs/data/adhoc/Prepublication-PUB_1EEVC-20171201/InterfacePublique-Controle-PUB_1EEVC-201711-PR-20181228-152828-indicateurs-PUB_1EEVC';
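With a table defined this way, rec holds one complete line of the file, so the individual fields can still be extracted at query time. A minimal sketch using Hive's built-in split function; the column aliases and the casts are assumptions made for illustration, since the question does not say what the fields mean:

```sql
-- Assumes test10 was created with the statement above, so rec is a full CSV line.
-- field_1, field_2, field_4 and field_5 are hypothetical aliases, not names from the question.
SELECT
  split(rec, ',')[0]                 AS field_1,
  split(rec, ',')[1]                 AS field_2,
  CAST(split(rec, ',')[3] AS BIGINT) AS field_4,
  CAST(split(rec, ',')[4] AS DOUBLE) AS field_5
FROM test10;
```

The third field is empty in the sample rows, so it is skipped here.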

Answer 1: (score: 0)

Your SQL:

  CREATE EXTERNAL TABLE test10 (rec string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    ROW FORMAT DELIMITED;
    LOCATION '/hdfs/data/adhoc/Prepublication-PUB_1EEVC-20171201/InterfacePublique-Controle-PUB_1EEVC-201711-PR-20181228-152828-indicateurs-PUB_1EEVC/*'

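There are several problems with it:

  • You specify ROW FORMAT DELIMITED twice.
  • You terminate the SQL statement before the LOCATION clause. Note the ; at the end of the second ROW FORMAT DELIMITED;
  • LOCATION must be a directory, not a file or a file pattern, so drop the trailing /*.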
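A statement that addresses the three points above might look like the following sketch. The table name test10_cols, the column names, and the column types are assumptions inferred from the sample rows, not taken from the question; adjust them to the real schema:

```sql
-- Sketch only: table name, column names and types are hypothetical.
-- col_2 is kept as STRING to preserve leading zeros in values such as 01.
-- ROW FORMAT DELIMITED appears once, the only ; ends the whole statement,
-- and LOCATION names the directory without a trailing /*.
CREATE EXTERNAL TABLE test10_cols (
  col_1 STRING,
  col_2 STRING,
  col_3 STRING,
  col_4 BIGINT,
  col_5 DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/hdfs/data/adhoc/Prepublication-PUB_1EEVC-20171201/InterfacePublique-Controle-PUB_1EEVC-201711-PR-20181228-152828-indicateurs-PUB_1EEVC';
```

Declaring one column per comma-separated field matters here: with FIELDS TERMINATED BY ',' and only a single rec column, Hive would keep just the first field of each line.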