How do I create a table on top of this data?

Time: 2017-01-10 00:52:34

Tags: sql hadoop hive hql

The data looks like this:

name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg

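On top of this data I created a table that maps the fields. However, the set of keys only stays the same until a row starting with a different key appears, like this:

New rows: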

name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg

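How do I create a table and schema for this? I tried mapping the strings through a table, but it did not work.

Can you tell me which delimiter to use so that I can create the table and get the key-value pairs out of the data?

I tried: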

Create table dataset (
    name string,
    SIA string,
    Age string,
    Place string,
    Qtype string,
    ID string,
    inorg string,
    file string
) ROW SEPERATED BY '||' stored as textfile;

1 Answer:

Answer 0 (score: 0)

You will have to write a custom SerDe, because the format you describe does not fall into any of the following categories:

Avro (Hive 0.9.1 and later)
ORC (Hive 0.11 and later)
RegEx
Thrift
Parquet (Hive 0.13 and later)
CSV (Hive 0.14 and later)
JsonSerDe (Hive 0.12 and later in hcatalog-core)

Alternatively, modify the data file and regenerate it: replace each || with , and turn every line into JSON, then use JsonSerDe.
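The regeneration step could be sketched in Python as below. This is a minimal sketch, not part of the original answer: the function names `line_to_record` and `convert` are hypothetical, and it assumes that values never contain `||` and that the first `:` in each field separates key from value. Keys are lower-cased because Hive stores column names in lower case; whether a given JsonSerDe needs that is an assumption worth checking.

```python
import json


def line_to_record(line, field_sep="||", kv_sep=":"):
    """Turn one 'key:value||key:value' line into a dict (hypothetical helper)."""
    record = {}
    for field in line.rstrip("\n").split(field_sep):
        if not field:
            continue  # skip empty fragments, e.g. from a trailing separator
        # Split on the first ':' only, so any later ':' stays in the value.
        key, _, value = field.partition(kv_sep)
        record[key.strip().lower()] = value
    return record


def convert(lines):
    """Yield one JSON object string per non-empty input line."""
    for line in lines:
        if line.strip():
            yield json.dumps(line_to_record(line))


sample = [
    "name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg",
    "SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg",
]

for json_line in convert(sample):
    print(json_line)
```

Each output line is a self-contained JSON object, so rows that carry SIA and Qtype and rows that carry name can live in the same file; keys missing from a row are simply absent from its object.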

Or try the RegEx SerDe.
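Before putting a pattern into a RegEx SerDe, it can be checked offline against the sample rows. The sketch below does that in Python and is not from the original answer: the pattern, `COLUMNS`, and `parse` are illustrative assumptions. It uses only positional capture groups, non-capturing groups, and lazy quantifiers, which Java regular expressions also support, but how the SerDe treats an optional group that does not match (for example, whether the column becomes NULL) should be verified in Hive itself.

```python
import re

# One capture group per column, in table order; name, SIA and Qtype are optional.
LINE_PATTERN = re.compile(
    r"^(?:name:(.*?)\|\|)?"
    r"(?:SIA:(.*?)\|\|)?"
    r"Age:(.*?)\|\|"
    r"Place:(.*?)\|\|"
    r"(?:Qtype:(.*?)\|\|)?"
    r"ID:(.*?)\|\|"
    r"inorg:(.*?)\|\|"
    r"file:(.*)$"
)

# Column order taken from the CREATE TABLE attempt in the question.
COLUMNS = ["name", "SIA", "Age", "Place", "Qtype", "ID", "inorg", "file"]


def parse(line):
    """Return a column -> value dict, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    # Groups that did not take part in the match come back as None.
    return dict(zip(COLUMNS, match.groups()))


rows = [
    "name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg",
    "SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg",
]

for row in rows:
    print(parse(row))
```

The pattern relies on the keys always appearing in the order shown in the question; a row with the keys in a different order would not match and `parse` would return None.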