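I have a shell script that extracts the CREATE TABLE statement for every table in a database. I loop over the statements one at a time, and inside the loop the current statement is held in the variable $DATA. I need to extract the column names from the PARTITIONED BY clause of each CREATE TABLE statement.
For example, with $DATA as the loop variable:
Input for iteration 1: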
DATA="CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100)) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')"
Output for iteration 1: DataOutput=depth,permi
Input for iteration 2:
DATA="CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')"
Output for iteration 2: DataOutput=depth
Input for iteration 3:
DATA="CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100), `www` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')"
Output for iteration 3: DataOutput=depth,permi,www
Answer 0 (score: 0)
Try this:
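use strict;
use warnings;

# One comma-separated list of partition columns per input statement.
my @bcktik;
while (<DATA>)
{
    # Capture the body of PARTITIONED BY (...), allowing one level of
    # nested parentheses such as varchar(100).
    if (m/PARTITIONED BY\s*\(((?:\([^()]*\)|[^()])*)\)/i)
    {
        # Column names are the backtick-quoted identifiers in the clause.
        push(@bcktik, join(",", $1 =~ m/`([^`]*)`/g));
    }
}
print "$_\n" for @bcktik;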
__DATA__
CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100)) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')
CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')
CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100), `www` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')
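
Since the question runs inside a shell loop, the same idea can also be sketched in bash alone, without calling Perl. This is a minimal sketch, not a full Hive DDL parser: the function name partition_columns is hypothetical, and it assumes column names are backtick-quoted and that type arguments nest at most one level deep, as in varchar(100). The sample statement is shortened, and it is read from a quoted here-document because backticks inside a double-quoted shell string would be treated as command substitution.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the PARTITIONED BY columns of one CREATE TABLE
# statement as a comma-separated list. The body runs in a subshell "( ... )"
# so that nocasematch and IFS do not leak into the calling script.
partition_columns() (
  shopt -s nocasematch   # match "partitioned by" in any letter case
  ddl=$1

  # Group 1 is the clause body; one level of nested parentheses is allowed.
  re='PARTITIONED BY[[:space:]]*\((([^()]|\([^()]*\))*)\)'
  [[ $ddl =~ $re ]] || return 1   # no PARTITIONED BY clause
  clause=${BASH_REMATCH[1]}

  # Collect every backtick-quoted identifier inside the clause.
  name_re='`([^`]*)`'
  cols=()
  while [[ $clause =~ $name_re ]]; do
    cols+=("${BASH_REMATCH[1]}")
    clause=${clause#*"${BASH_REMATCH[0]}"}   # drop text up to this match
  done

  IFS=,
  printf '%s\n' "${cols[*]}"
)

# Usage with a shortened sample statement.
IFS= read -r DATA <<'EOF'
CREATE TABLE `xxx`( `path` varchar(200)) PARTITIONED BY ( `depth` int, `permi` varchar(100)) ROW FORMAT SERDE 'x'
EOF

DataOutput=$(partition_columns "$DATA")
echo "DataOutput=$DataOutput"   # prints DataOutput=depth,permi
```

For a table without a PARTITIONED BY clause the function prints nothing and returns a non-zero status, so the loop can test the return value to skip unpartitioned tables.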