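I want to import a CSV file into a Hive table. The CSV file has commas (,) inside some field values. How can we escape them?
Answer 0 (score: 2)
You can use the CSV SerDe; which properties to set depends on how the commas are protected in your data.
If the fields that contain commas are enclosed in double quotes: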
sam,1,"sam is adventurous, brave"
bob,2,"bob is affectionate, affable"
CREATE EXTERNAL TABLE csv_table (name STRING, userid BIGINT, comment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"quoteChar" = "\""
)
STORED AS TEXTFILE
LOCATION 'location_of_csv_file';
If the commas inside the fields are escaped with a backslash, as below:
sam,1,sam is adventurous\, brave
bob,2,bob is affectionate\, affable
CREATE EXTERNAL TABLE csv_table (name STRING, userid BIGINT, comment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"escapeChar" = "\\"
)
STORED AS TEXTFILE
LOCATION '/user/cloudera/input/csv';
In both cases, the output is as follows:
hive> select * from csv_table;
OK
sam 1 sam is adventurous, brave
bob 2 bob is affectionate, affable
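One caveat: OpenCSVSerde reads every column as STRING, so userid comes back as a string even though the table declares it as BIGINT. If you need the numeric type downstream, a view that casts the column is one possible workaround. The following is only a sketch; the view name csv_table_typed is made up for illustration, and it assumes the csv_table definition shown above.

```sql
-- Hypothetical view name; assumes csv_table was created as shown above.
-- OpenCSVSerde exposes all columns as STRING, so cast userid explicitly.
CREATE VIEW csv_table_typed AS
SELECT
  name,
  CAST(userid AS BIGINT) AS userid,  -- yields NULL if the value is not numeric
  `comment`                          -- backticks guard against keyword clashes
FROM csv_table;

-- Query the typed view instead of the raw table.
SELECT name, userid FROM csv_table_typed WHERE userid > 1;
```

Running DESCRIBE csv_table should show all three columns as string, which is a quick way to confirm this behaviour on your Hive version.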