AWK - Stripping DDL lines

Date: 2014-06-11 13:44:03

Tags: bash awk sed

I have the following file:

CREATE TABLE "DB2INST1 "."EMAIL_ADDRESS"  (
                  "EMAIL_ADDRESS_ID" INTEGER NOT NULL ,
                  "PERSON_ID" INTEGER ,
                  "EMAIL_ADDRESS" VARCHAR(128) NOT NULL ,
                  "CC_ADDRESS" VARCHAR(128) ,
                  "UPPER_EMAIL_ADDRESS" VARCHAR(128) GENERATED ALWAYS AS (UPPER(EMAIL_ADDRESS)) ,
                 IN "USERSPACE5" INDEX IN "IDXSPACE5" LONG IN "LONGSPCE1" ;


CREATE TABLE "DB2INST "."FIELD_RESPONSE"  (
                  "FIELD_RESPONSE_ID" INTEGER NOT NULL ,
                  "CUSTOM_FIELD_ID" INTEGER ,
                  "NAME" VARCHAR(100) NOT NULL ,
                  "RESPONSE" VARCHAR(100) ,
                  "RESPONSE_LONG" CLOB(256000) LOGGED NOT COMPACT ,
                  "FIELD_RESPONSE_100" VARCHAR(100) GENERATED ALWAYS AS (case when RESPONSE is null or RESPONSE = '' then cast(RESPONSE_LONG as varchar(100)) else RESPONSE end) ,
                 COMPRESS YES
                 DATA CAPTURE CHANGES
                 IN "USERSPACE1" INDEX IN "IDXSPACE1" LONG IN "LONGSPCE1" ;

I can pull each CREATE TABLE record out of the DDL with the following AWK command:

awk '/^CREATE TABLE/ {print}' FS="\n" RS="" < src.ddl > tables.ddl 
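For readers unfamiliar with `RS=""`, here is a minimal sketch of what that setting does. The sample statements and the `sample` and `tables` variable names are made up for illustration. Setting `ORS` to two newlines is an optional addition that puts the blank line back between the records that are kept.

```shell
# Hypothetical sample: three blank-line-separated statements.
sample=$(mktemp)
printf 'CREATE TABLE a (\n  x INT ) ;\n\nCREATE INDEX i ON a (x) ;\n\nCREATE TABLE b (\n  y INT ) ;\n' > "$sample"

# RS="" switches awk to paragraph mode: each blank-line-separated block is one record.
# ORS="\n\n" restores the blank line between the records that are printed.
# A pattern with no action prints every record that matches.
tables=$(awk 'BEGIN { RS = ""; ORS = "\n\n" } /^CREATE TABLE/' "$sample")
printf '%s\n' "$tables"
rm -f "$sample"
```

Because `^` anchors at the start of the whole record, the CREATE INDEX block is skipped even though it spans its own lines.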

Now I am trying to strip out GENERATED ALWAYS and everything after it on the line. I would like the file to look like this (result):

CREATE TABLE "DB2INST1 "."EMAIL_ADDRESS"  (
                  "EMAIL_ADDRESS_ID" INTEGER NOT NULL ,
                  "PERSON_ID" INTEGER ,
                  "EMAIL_ADDRESS" VARCHAR(128) NOT NULL ,
                  "CC_ADDRESS" VARCHAR(128) ,
                  "UPPER_EMAIL_ADDRESS" VARCHAR(128)  ,
                 IN "USERSPACE5" INDEX IN "IDXSPACE5" LONG IN "LONGSPCE1" ;


CREATE TABLE "DB2INST "."FIELD_RESPONSE"  (
                  "FIELD_RESPONSE_ID" INTEGER NOT NULL ,
                  "CUSTOM_FIELD_ID" INTEGER ,
                  "NAME" VARCHAR(100) NOT NULL ,
                  "RESPONSE" VARCHAR(100) ,
                  "RESPONSE_LONG" CLOB(256000) LOGGED NOT COMPACT ,
                  "FIELD_RESPONSE_100" VARCHAR(100)  ,
                 COMPRESS YES
                 DATA CAPTURE CHANGES
                 IN "USERSPACE1" INDEX IN "IDXSPACE1" LONG IN "LONGSPCE1" ;

I tried using this AWK:

|awk '{print $1 "  " $2 ", " }' < tables.ddl ...

However, this only prints two columns of data.

Any suggestions?

3 answers:

Answer 0 (score: 2)

sed can be a good tool for this:

sed 's/GENERATED ALWAYS AS.*$/,/' file

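This matches everything from GENERATED ALWAYS AS to the end of the line and replaces it with a comma.

If you want to edit in place, use -i. With the .bak suffix, sed saves the current content as file.bak, and file will contain the new version.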

sed -i.bak 's/GENERATED ALWAYS AS.*$/,/' file
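Before running this against the real file, the substitution can be checked on a single line. This is only a sketch: the line is copied from the question, and the `line` and `cleaned` variable names are made up for illustration.

```shell
# One column definition taken from the question's DDL.
line='"UPPER_EMAIL_ADDRESS" VARCHAR(128) GENERATED ALWAYS AS (UPPER(EMAIL_ADDRESS)) ,'

# Replace everything from GENERATED ALWAYS AS to the end of the line with a comma.
cleaned=$(printf '%s\n' "$line" | sed 's/GENERATED ALWAYS AS.*$/,/')
printf '%s\n' "$cleaned"
```

Note that the replacement always writes a comma, so a generated column that is the last one in its table and has no trailing comma would gain one.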

Answer 1 (score: 0)

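A slight modification of your code, using gensub (a GNU awk extension). In paragraph mode the record contains newlines and . matches them too, so the pattern uses [^\n]* to stay on the GENERATED ALWAYS line:

awk '/^CREATE TABLE/ {print gensub(/GENERATED ALWAYS [^\n]*\n/,",\n","g") "\n";}' FS="\n" RS="" src.ddl > tables.ddl 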
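gensub is only available in GNU awk. Where that is a concern, the following sketch might serve as a portable alternative: it treats each line of the record as a field and runs sub on every field, so several generated columns in one table are each handled separately. The sample DDL and the `src` and `result` variable names are made up for illustration.

```shell
# Hypothetical sample with two generated columns and an unrelated statement.
src=$(mktemp)
cat > "$src" <<'EOF'
CREATE TABLE t (
  "A" VARCHAR(10) GENERATED ALWAYS AS (UPPER(X)) ,
  "B" VARCHAR(10) GENERATED ALWAYS AS (LOWER(X)) ,
  COMPRESS YES
  IN "USERSPACE1" ;

CREATE INDEX i ON t (A) ;
EOF

# Paragraph mode with one field per line; OFS="\n" keeps the line layout
# when awk rebuilds the record after a field has been modified.
result=$(awk 'BEGIN { RS = ""; FS = OFS = "\n"; ORS = "\n\n" }
/^CREATE TABLE/ {
    for (i = 1; i <= NF; i++)
        sub(/GENERATED ALWAYS.*/, ",", $i)   # trims only within this line
    print
}' "$src")
printf '%s\n' "$result"
rm -f "$src"
```

Since each sub call sees only one line, the match cannot run past the end of the column definition into the lines that follow.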

Answer 2 (score: 0)

You can do it like this:

awk 'BEGIN{FS="\n"; RS=""}/^CREATE TABLE/{sub(/GENERATED ALWAYS.*,/, ","); print}' src.ddl > tables.ddl 

Expected output:

CREATE TABLE "DB2INST1 "."EMAIL_ADDRESS"  (
                  "EMAIL_ADDRESS_ID" INTEGER NOT NULL ,
                  "PERSON_ID" INTEGER ,
                  "EMAIL_ADDRESS" VARCHAR(128) NOT NULL ,
                  "CC_ADDRESS" VARCHAR(128) ,
                  "UPPER_EMAIL_ADDRESS" VARCHAR(128) ,
                 IN "USERSPACE5" INDEX IN "IDXSPACE5" LONG IN "LONGSPCE1" ;
CREATE TABLE "DB2INST "."FIELD_RESPONSE"  (
                  "FIELD_RESPONSE_ID" INTEGER NOT NULL ,
                  "CUSTOM_FIELD_ID" INTEGER ,
                  "NAME" VARCHAR(100) NOT NULL ,
                  "RESPONSE" VARCHAR(100) ,
                  "RESPONSE_LONG" CLOB(256000) LOGGED NOT COMPACT ,
                  "FIELD_RESPONSE_100" VARCHAR(100) ,
                 COMPRESS YES
                 DATA CAPTURE CHANGES
                 IN "USERSPACE1" INDEX IN "IDXSPACE1" LONG IN "LONGSPCE1" ;