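How do I import a CSV file into a Hive table when field values contain the delimiter (for example, a comma)?

Time: 2015-10-18 19:42:42

Tags: hive

I want to import a CSV file into a Hive table. Some field values in the CSV file contain commas (,). How can we escape them?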

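1 answer:

Answer 0 (score: 2)

You can use the CSV SerDe; which properties to set depends on how the commas appear in your data.

If the fields that contain commas are enclosed in quotes: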

sam,1,"sam is adventurous, brave"
bob,2,"bob is affectionate, affable"

CREATE EXTERNAL TABLE csv_table (name STRING, userid BIGINT, comment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   -- quoteChar marks fields that may contain the separator
   "separatorChar" = ",",
   "quoteChar"     = "\""
)
STORED AS TEXTFILE
LOCATION 'location_of_csv_file';
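
For what it's worth, the Hive documentation lists comma, double quote, and backslash as the defaults for separatorChar, quoteChar, and escapeChar, so the properties can presumably be left out for data like this. A minimal sketch, assuming the same file location; the table name csv_table_defaults is made up for illustration:

```sql
-- Hypothetical table name; relies on the OpenCSVSerde defaults
-- (separatorChar ",", quoteChar "\"", escapeChar "\\").
CREATE EXTERNAL TABLE csv_table_defaults (name STRING, userid BIGINT, comment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION 'location_of_csv_file';
```

Spelling the properties out, as in the statement above, is still useful because it documents the file format in the DDL itself.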

If the commas inside the fields are escaped with a backslash, as follows:

sam,1,sam is adventurous\, brave
bob,2,bob is affectionate\, affable

CREATE EXTERNAL TABLE csv_table (name STRING, userid BIGINT, comment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   -- escapeChar is a single backslash, written as \\ inside the HiveQL string literal
   "separatorChar" = ",",
   "escapeChar"    = "\\"
)
STORED AS TEXTFILE
LOCATION '/user/cloudera/input/csv';

In both cases, the output is as follows:

hive> select * from csv_table;
OK
sam 1   sam is adventurous
bob 2   bob is affectionate
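
One thing to keep in mind: as far as the Hive documentation describes it, OpenCSVSerde reads every column as a string, whatever type the DDL declares, so userid comes back as STRING rather than BIGINT. A sketch of one way to get a typed column is to put a view on top of the table; the view name csv_table_typed is hypothetical:

```sql
-- Hypothetical view name; casts the string column produced by OpenCSVSerde.
CREATE VIEW csv_table_typed AS
SELECT
  name,
  CAST(userid AS BIGINT) AS userid,  -- NULL if the value is not numeric
  `comment`                          -- backticks in case the name is treated as a keyword
FROM csv_table;
```

Queries that need numeric comparisons or joins on userid can then select from the view instead of the raw table.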