Hive query on CSV: text-delimiter problem

Time: 2016-03-31 06:18:11

Tags: hadoop hive delimiter

I am trying to import the following data into Hive.

Name,Phone,Address

Arverne,(718) 634-4784,"312 Beach 54 Street
Arverne, NY 11692
(40.59428994144626, -73.78442865540268)"

Astoria,(718) 278-2220,"14 01 Astoria Boulevard
Long Island City, NY 11102
(40.77152402451418, -73.92643545073543)"

Auburndale,(718) 352-2027,"25 55 Francis Lewis Boulevard
Flushing, NY 11358
(40.76035096822195, -73.79632645819947)"

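But the address column does not load correctly, so the table data comes out corrupted. I suspect a line-terminator problem (the default is \n, and each address spans 3-4 lines), because with the single-line sample data below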

a,b,"e,f"

x,y,"l,m"

the following query

create table test(c1 string, c2 string, c3 string)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
with serdeproperties(
"separatorChar" = ",");

works fine:

test.c1 test.c2 test.c3

a   b   e,f

x   y   l,m

How can I get the multi-line addresses to load the same way?
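One way to check the line-terminator suspicion outside Hive is to compare what a line-oriented reader sees with what a quote-aware CSV parser sees. The sketch below is Python (standard library only), not HiveQL. The `RAW` string is the first two records from above, written without the asterisks, and the variable names are made up for the illustration.

```python
import csv
import io

# First two records from the question; each quoted address spans three physical lines.
RAW = (
    'Arverne,(718) 634-4784,"312 Beach 54 Street\n'
    'Arverne, NY 11692\n'
    '(40.59428994144626, -73.78442865540268)"\n'
    'Astoria,(718) 278-2220,"14 01 Astoria Boulevard\n'
    'Long Island City, NY 11102\n'
    '(40.77152402451418, -73.92643545073543)"\n'
)

# A reader that splits on "\n" treats every physical line as a separate row.
physical_lines = RAW.splitlines()

# A quote-aware parser keeps newlines inside quotes as part of the field.
logical_records = list(csv.reader(io.StringIO(RAW, newline="")))

print(len(physical_lines))      # 6
print(len(logical_records))     # 2
print(logical_records[0][2])    # the full three-line address
```

If the two counts differ, as they do here, the file contains newlines inside quoted fields, and any reader that splits rows on \n before looking at quotes will break each address into several rows.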

1 answer:

Answer 0 (score: 0)

This is how I got it to work.

>>> CREATE TABLE hiveTest(name string, phone string, address string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
>>> load data inpath 'file.csv' into table hiveTest;

>>>  select name from hiveTest;
+-------------+--+
|    name     |
+-------------+--+
| Arverne     |
| Astoria     |
| Auburndale  |
+-------------+--+
>>> select address from hiveTest;
+--------------------------------------------+--+
|                  address                   |
+--------------------------------------------+--+
| "312 Beach 54 Street Arverne               |
| "14 01 Astoria Boulevard Long Island City  |
| "25 55 Francis Lewis Boulevard Flushing    |
+--------------------------------------------+--+

I hope this helps.
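In the output above the address stops at the first comma inside the quoted field and keeps the opening quote, because a plain `FIELDS TERMINATED BY ','` table does not interpret quotes. A possible workaround, sketched here in Python under the assumption that the file can be preprocessed before loading, is to rewrite it as one record per line with a delimiter that does not occur in the data. The function name `flatten_csv` and the tab delimiter are illustrative choices, not something from the original post.

```python
import csv
import io


def flatten_csv(text: str, out_delimiter: str = "\t") -> str:
    """Rewrite quoted, multi-line CSV as one delimiter-separated record per line."""
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO()
    for row in reader:
        if not row:
            continue  # skip blank lines between records
        # Collapse embedded newlines and runs of whitespace into single spaces.
        cleaned = [" ".join(field.split()) for field in row]
        out.write(out_delimiter.join(cleaned) + "\n")
    return out.getvalue()


sample = (
    'Arverne,(718) 634-4784,"312 Beach 54 Street \n'
    'Arverne, NY 11692\n'
    '(40.59428994144626, -73.78442865540268)"\n'
)
flattened = flatten_csv(sample)
print(flattened)
```

The flattened file could then be loaded into a table declared with `FIELDS TERMINATED BY '\t'`, so the commas inside the address no longer split the column.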