Hive table definition - multiple-space delimiter

Time: 2014-08-27 21:19:23

Tags: hive field delimiter space

I am defining a Hive table whose data has between 1 and n spaces separating each field. How do I define the delimiter value in this case?

I initially defined the table as:

CREATE EXTERNAL TABLE rtt (
    field1 STRING,
    field2 STRING,
    field3 STRING,
    field4 STRING,
    field5 STRING,
    field6 INT,
    field7 FLOAT)
COMMENT 'New data set'
PARTITIONED BY (year INT, month INT, day INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/test-dir/raw/2014/08/07/';

1 answer:

Answer 0: (score: -1)

Try the RegexSerDe, as described in "Create HIVE Table with multi character delimiter".

I think the regular expression you want to use as the delimiter is "\s+".
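A minimal sketch of what that could look like for the table in the question, under a few assumptions. RegexSerDe does not take a delimiter pattern; its "input.regex" property has to match the whole line, with one capturing group per column. The pattern below assumes that no field contains a space and that lines have no leading or trailing spaces. It uses a literal space rather than "\s" to sidestep backslash escaping in the string literal. All columns are declared STRING here because some RegexSerDe versions accept only string columns, so the numeric fields are cast at query time. The class name is the one bundled with Hive itself; older installations may need the contrib class org.apache.hadoop.hive.contrib.serde2.RegexSerDe instead.

```sql
-- Sketch only: same table as in the question, with ROW FORMAT DELIMITED
-- swapped for RegexSerDe. Each (...) group captures one column;
-- " +" between groups consumes one or more spaces.
CREATE EXTERNAL TABLE rtt (
    field1 STRING,
    field2 STRING,
    field3 STRING,
    field4 STRING,
    field5 STRING,
    field6 STRING,
    field7 STRING)
COMMENT 'New data set'
PARTITIONED BY (year INT, month INT, day INT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
    "input.regex" = "([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+)"
)
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/test-dir/raw/2014/08/07/';

-- Cast the numeric columns when reading, since they are stored as STRING.
SELECT field1,
       CAST(field6 AS INT)   AS field6,
       CAST(field7 AS FLOAT) AS field7
FROM rtt;
```

Lines that do not match the pattern come back with NULL in every column, so a quick count of rows where field1 IS NULL is a reasonable way to check whether the regex fits the data.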