My data looks like this:
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
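For this data I have created a table that maps onto the keys. However, the set of keys stays constant only until a row starting with a new key appears, as shown below.
New rows: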
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg
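How do I create a table and schema for this? I tried mapping the strings through the table, but it did not work.
Can you tell me which delimiter to use to create the table and get the key-value pairs out of the data?
I tried: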
Create table dataset (
name string,
SIA string,
Age string,
Place string,
Qtype string,
ID string,
inorg string,
file string
) ROW SEPERATED BY '||' stored as textfile;
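Answer 0 (score: 0)
You will have to write a custom SerDe, because the format you describe does not fall into any of the following built-in categories: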
Avro (Hive 0.9.1 and later)
ORC (Hive 0.11 and later)
RegEx
Thrift
Parquet (Hive 0.13 and later)
CSV (Hive 0.14 and later)
JsonSerDe (Hive 0.12 and later in hcatalog-core)
Alternatively, modify your data file: replace || with , and turn each row into JSON, then use JsonSerDe,
or try the RegEx SerDe.
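
If you go the JSON route, the conversion can be done in a small preprocessing step before loading the file. Below is a minimal sketch in Python, assuming each row is a list of key:value fields joined by || and that keys never contain a colon. The function names (parse_line, to_json_line, convert) are hypothetical, and the column list is copied from the CREATE TABLE attempt in the question. Keys missing from a row (for example SIA on a name row) are written as null, so every JSON object has the same eight fields.

```python
import json

# Column names copied from the CREATE TABLE attempt in the question.
COLUMNS = ["name", "SIA", "Age", "Place", "Qtype", "ID", "inorg", "file"]


def parse_line(line):
    """Split one '||'-delimited row into a dict of key -> value."""
    record = {}
    for field in line.rstrip("\n").split("||"):
        if not field:
            continue  # skip empty pieces, e.g. from a trailing '||'
        # Split on the first ':' only, so values may contain ':' themselves.
        key, sep, value = field.partition(":")
        if sep:
            record[key] = value
    return record


def to_json_line(line):
    """Return the row as one JSON object; absent keys become null."""
    record = parse_line(line)
    return json.dumps({col: record.get(col) for col in COLUMNS})


def convert(lines):
    """Yield one JSON string per non-blank input row."""
    for raw in lines:
        if raw.strip():
            yield to_json_line(raw)
```

To use it, pass the open input file to convert() and write each yielded string on its own line of the output file; the resulting one-object-per-line file is the shape a JSON SerDe table expects. The table definition itself still has to be written and checked against your Hive version.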