提取配置单元中的表结构

时间:2018-09-25 07:36:23

标签: unix

我只需要提取仅包含列的create table结构。

 hive_table

show create table hive_table:
create table hive_table(id number,age number)
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION  'hdfs:/path/'

我只需要在下面

create table hive_table(id number,age number);

3 个答案:

答案 0 :(得分:0)

在这种情况下,Sed将很困难。检查此perl解决方案。

> cat hive_table.txt
show create table hive_table:
create table hive_table(id number,age number)
PARTITIONED BY ( year number )
STORED AS INPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION  'hdfs:/path/'
TBLPROPERTIES (   'spark.sql.sources ....)
> perl -ne 'BEGIN{$x=qx(cat hive_table.txt);$x=~s/(.+)(create table.+?)(PARTITIONED BY|STORED AS INPUTFORMAT|ROW FORMAT SERDE|OUTPUTFORMAT|LOCATION|TBLPROPERTIES)(.*)/$2/osm; print $x ; exit } '
create table hive_table(id number,age number)
>

答案 1 :(得分:0)

例如,在下面考虑

CREATE TABLE `test`(
  `id` string COMMENT '',
  `age` string COMMENT '',
  `city` string COMMENT '')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
WITH SERDEPROPERTIES (
  'path'='hdfs://local/')
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://local/'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='false',
  'EXTERNAL'='FALSE',
  'numFiles'='14',
  'numRows'='-1',
  'rawDataSize'='-1',
  'spark.sql.sources.provider'='orc',
  'spark.sql.sources.schema.numParts'='9',
'spark.sql.sources.schema.part.8'='....":{}}]}',
  'totalSize'='12',
  'transient_lastDdlTime'='12')

我会要求

CREATE TABLE `test`(
  `id` string COMMENT '',
  `age` string COMMENT '',
  `city` string COMMENT '')

答案 2 :(得分:0)

我正在尝试这样

#!/bin/bash
rm -f tableNames.txt
rm -f HiveTableDDL.txt
beeline --showHeader=false --outputformat=tsv2 -u jdbc:hive2:// -n hive -e "show tables like 'test*';" > tableNames.txt 
wait
while read LINE
do
   beeline --showHeader=false --outputformat=tsv2 -u jdbc:hive2:// -n hive -e "show create table $LINE" | perl -ne 'BEGIN{$x=qx(cat test.txt);$x=~s/(.+)(create table.+?)(ROW FORMAT SERDE|STORED AS INPUTFORMAT|ROW FORMAT SERDE|OUTPUTFORMAT|LOCATION|TBLPROPERTIES)(.*)/$2/osm; print "$x STORED AS ORC\n" ; exit } '
   printf  ";\n\n" 
done < tableNames.txt >> HiveTableDDL.txt
rm -f tableNames.txt
echo "Table DDL generated"