NULL values when loading data into a Hive table

Time: 2014-06-27 04:35:28

Tags: hive

I have a text file with six comma-separated fields. Here is a sample:

1,8,07-Jun-14,12:31:38,14:54:04,0.5
1,7,07-Jun-14,10:18:34,13:30:56,0.5
1,6,05-Jun-14,13:37:43,15:18:57,0.5
1,8,03-Jun-14,08:15:10,11:28:17,0.5
1,7,05-Jun-14,07:15:40,11:15:24,0.5
1,2,05-Jun-14,10:09:04,11:42:54,0.5
1,9,05-Jun-14,11:46:22,13:54:30,0.5
1,3,03-Jun-14,07:14:10,10:47:10,0.5

When I load this data into a Hive table, I get NULL values for the last column.

Here are my CREATE and LOAD statements:

CREATE EXTERNAL TABLE Kantar_Data(Home_Id INT,Channel INT, Date_Id STRING, Start_Time STRING, End_Time STRING,Weight FLOAT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

load data local inpath '/media/test/Kantar/output26.txt' into table Kantar_Data;

Here is a sample of the output I get:

1       6       02-Jun-14       07:12:42        10:25:53        NULL
1       6       05-Jun-14       07:24:39        12:15:57        NULL
1       9       07-Jun-14       12:07:43        14:40:59        NULL
1       5       04-Jun-14       08:09:54        08:11:11        NULL
1       8       07-Jun-14       08:01:42        10:44:53        NULL
1       5       02-Jun-14       08:33:29        12:02:03        NULL
1       3       03-Jun-14       09:53:21        10:48:08        NULL
1       6       02-Jun-14       11:44:06        11:56:40        NULL

Can anyone tell me what the problem is?

Thanks in advance...

2 answers:

Answer 0: (score: 0)

I don't believe FLOAT is the data type you want here; I believe it should be DECIMAL. See the documentation here:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Decimals

Answer 1: (score: 0)

This works perfectly fine; FLOAT is a valid data type.

Here is what I did:

CREATE EXTERNAL TABLE Kantar_Data(Home_Id INT,Channel INT, Date_Id STRING, Start_Time STRING, End_Time STRING,Weight FLOAT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

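You don't have to store the file on HDFS. It can sit anywhere on the server's local filesystem; for example, my file is at '/root/User1/testfile.csv', and I loaded it into the table Kantar_Data with LOAD DATA LOCAL INPATH: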

hive> load data local inpath '/root/User1/testfile.csv'
    > into table Kantar_Data;
Copying data from file:/root/User1/testfile.csv
Copying file: file:/root/User1/testfile.csv
Loading data to table default.kantar_data
Table default.kantar_data stats: [num_partitions: 0, num_files: 3, num_rows: 0, total_size: 864, raw_data_size: 0]
OK
Time taken: 0.374 seconds
hive> select * from Kantar_Data;
OK
1   8   07-Jun-14   12:31:38    14:54:04    0.5
1   7   07-Jun-14   10:18:34    13:30:56    0.5
1   6   05-Jun-14   13:37:43    15:18:57    0.5
1   8   03-Jun-14   08:15:10    11:28:17    0.5
1   7   05-Jun-14   07:15:40    11:15:24    0.5
1   2   05-Jun-14   10:09:04    11:42:54    0.5
1   9   05-Jun-14   11:46:22    13:54:30    0.5
1   3   03-Jun-14   07:14:10    10:47:10    0.5
1   8   07-Jun-14   12:31:38    14:54:04    0.5
1   7   07-Jun-14   10:18:34    13:30:56    0.5
1   6   05-Jun-14   13:37:43    15:18:57    0.5
1   8   03-Jun-14   08:15:10    11:28:17    0.5
1   7   05-Jun-14   07:15:40    11:15:24    0.5
1   2   05-Jun-14   10:09:04    11:42:54    0.5
1   9   05-Jun-14   11:46:22    13:54:30    0.5
1   3   03-Jun-14   07:14:10    10:47:10    0.5
1   8   07-Jun-14   12:31:38    14:54:04    0.5
1   7   07-Jun-14   10:18:34    13:30:56    0.5
1   6   05-Jun-14   13:37:43    15:18:57    0.5
1   8   03-Jun-14   08:15:10    11:28:17    0.5
1   7   05-Jun-14   07:15:40    11:15:24    0.5
1   2   05-Jun-14   10:09:04    11:42:54    0.5
1   9   05-Jun-14   11:46:22    13:54:30    0.5
1   3   03-Jun-14   07:14:10    10:47:10    0.5
Time taken: 0.112 seconds, Fetched: 24 row(s)

There must be something wrong with your file; please check it!
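
One way to do that check, as a rough sketch rather than a confirmed diagnosis: Hive returns NULL for a field it cannot convert to the column's declared type, so it is worth looking at the raw bytes of the last field. The Python helper below is my own illustration (the names `find_bad_weights` and `check_file` are hypothetical, not part of Hive). It flags rows whose last field has hidden whitespace such as a trailing '\r', is not numeric, or sits in a row with the wrong number of fields. Whether any particular oddity actually produces NULL depends on your Hive version and SerDe, so treat the output as a list of things to look at.

```python
def find_bad_weights(lines, field_count=6):
    """Return (line_number, raw_last_field) for rows that look suspicious.

    Assumes comma-separated rows terminated by '\n', matching the
    FIELDS/LINES TERMINATED BY clauses in the CREATE TABLE statement.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        # Remove only the '\n' row terminator; keep everything else visible.
        row = line[:-1] if line.endswith("\n") else line
        if not row:
            continue  # skip completely empty lines
        fields = row.split(",")
        last = fields[-1]
        try:
            float(last)
            numeric = True
        except ValueError:
            numeric = False
        # Python's float() ignores surrounding whitespace, so compare
        # against the stripped text to catch '\r', spaces, or tabs.
        has_hidden_whitespace = last != last.strip()
        if len(fields) != field_count or not numeric or has_hidden_whitespace:
            bad.append((number, last))
    return bad


def check_file(path):
    """Print every suspicious row of the file at `path`."""
    # newline="" stops Python from translating '\r\n' into '\n',
    # so Windows-style line endings stay visible in the report.
    with open(path, newline="") as handle:
        for number, raw in find_bad_weights(handle):
            print(f"line {number}: last field is {raw!r}")
```

Calling `check_file('/media/test/Kantar/output26.txt')` (the path from the question) prints each suspicious last field with `repr`, so a stray carriage return shows up as `'0.5\r'`. If nothing is reported, the file contents are probably not the cause and the table definition or Hive setup is the next thing to examine.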