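Extracting text between a pair of parentheses with awk

Asked: 2011-05-26 12:34:36

Tags: postgresql awk plpgsql

We have a SQL file containing function definitions. We want to read this file and produce a second SQL file that contains a DROP statement for every function defined in the first one.

For example, the first SQL file contains: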

CREATE OR REPLACE FUNCTION folder_cycle_check (folder_key INTEGER, new_parent_folder_key INTEGER) RETURNS VOID AS $procedure$

DECLARE 
    parent_of_parent INTEGER;
BEGIN
    IF folder_key = new_parent_folder_key THEN
        RAISE EXCEPTION 'Illegal cycle detected',new_parent_folder_key;
    END IF;
SELECT INTO parent_of_parent  (SELECT parent_folder_key FROM folder where folder_key = new_parent_folder_key);

IF new_parent_folder_key IS NOT NULL THEN
    PERFORM folder_cycle_check(folder_key, parent_of_parent);
END IF;

END; $procedure$
LANGUAGE plpgsql;

Now I want the target SQL file to contain:

DROP FUNCTION folder_cycle_check((folder_key INTEGER, new_parent_folder_key INTEGER)

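To do this I wrote a "genDrop.txt" file, which I pass to the awk.exe command together with the first SQL file. The problem with "genDrop.txt" is that the target SQL file it generates contains drop statements without the argument list: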

DROP FUNCTION folder_cycle_check
which is not useful, because PostgreSQL expects the argument list as well:
DROP FUNCTION folder_cycle_check(folder_key INTEGER, new_parent_folder_key INTEGER)

Can anyone help me? I am new to awk programming. FYI, "genDrop.txt" looks like this:

#######################################################################
# AWK program to generate drop statements from create table, procedure, and view statements
############################################################################

function dropit(objtype, objname, rulename)
{
#   l[lines++] = "DROP " objtype " " objname " -- Line " NR ", Rule " rulename;
    l[lines++] = "DROP " objtype " " objname 
    next
}

function dropitpg(objtype, objname, funcargs, rulename)
{
#   l[lines++] = "DROP " objtype " " objname " -- Line " NR ", Rule " rulename;
    l[lines++] = "DROP " objtype " " objname " " funcargs
    next
}


BEGIN { FS="[ (;]*" }
# trim the line
{$2 = $2 }
# "grab creates" 
/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Pp][Rr][Oo][Cc]/             {dropit($3, $4, "CPs") }
/^[Cc][Rr][Ee][Aa][Tt][Ee] *[Pp][Rr][Oo][Cc]/               {dropit($2, $3, "CP") }

/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Oo][Rr] *[Rr][Ee][Pp][Ll][Aa][Cc][Ee] *[Pp][Rr][Oo][Cc]/ {dropit($5, $6, "CPs") }
/[Cc][Rr][Ee][Aa][Tt][Ee] *[Oo][Rr] *[Rr][Ee][Pp][Ll][Aa][Cc][Ee] *[Pp][Rr][Oo][Cc]/    {dropit($4, $5, "CP") }


/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Vv][Ii][Ee][Ww]/             {dropit($3, $4, "CVs") }
/[Cc][Rr][Ee][Aa][Tt][Ee] *[Vv][Ii][Ee][Ww]/                {dropit($2, $3, "CV") }

/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Oo][Rr] *[Rr][Ee][Pp][Ll][Aa][Cc][Ee] *[Vv][Ii][Ee][Ww]/ {dropit($5, $6, "CRVs") }
/[Cc][Rr][Ee][Aa][Tt][Ee] *[Oo][Rr] *[Rr][Ee][Pp][Ll][Aa][Cc][Ee] *[Vv][Ii][Ee][Ww]/    {dropit($4, $5, "CRV") }


/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Tt][Aa][Bb][Ll][Ee]/             {dropit($3, $4, "CTs") }
/^[Cc][Rr][Ee][Aa][Tt][Ee] *[Tt][Aa][Bb][Ll][Ee]/           {dropit($2, $3, "CT") }

/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Ss][Ee][Qq][Uu][Ee][Nn][Cc][Ee]/     {dropit($3, $4, "CSs") }
/[Cc][Rr][Ee][Aa][Tt][Ee] *[Ss][Ee][Qq][Uu][Ee][Nn][Cc][Ee]/    {dropit($2, $3, "CS") }

/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Ff][Uu][Nn][Cc][Tt][Ii][Oo][Nn]/     {dropit($3, $4, "CSs") }
/[Cc][Rr][Ee][Aa][Tt][Ee] *[Ff][Uu][Nn][Cc][Tt][Ii][Oo][Nn]/    {dropit($2, $3, "CS") }


END{
    print "-- Beginning " lines " drop statements"
    for (i = lines - 1; i >= 0; --i) {
    print l[i]
    print EOS
    print ""
    }
    print "-- End of " lines " drop statements"
}

BEGIN { FS="[ ;]*" }
/^ +[Cc][Rr][Ee][Aa][Tt][Ee] *[Oo][Rr] *[Rr][Ee][Pp][Ll][Aa][Cc][Ee] *[Ff][Uu][Nn][Cc][Tt][Ii][Oo][Nn]/     {dropitpg($5, $7, "CSs") }
/[Cc][Rr][Ee][Aa][Tt][Ee] *[Oo][Rr] *[Rr][Ee][Pp][Ll][Aa][Cc][Ee] *[Ff][Uu][Nn][Cc][Tt][Ii][Oo][Nn]/    {sed -nr "s/\s*\[([^\]+)\]/\1/p" }

END{
    print "-- Beginning " lines " drop statements"
    for (i = lines - 1; i >= 0; --i) {
    print l[i]
    print EOS
    print ""
    }
    print "-- End of " lines " drop statements"
}

1 answer:

Answer 0: (score: 1)

If your sample output (minus the extra open paren) is all you need, then I think your script is overkill. How about this?

#! /bin/awk -f
{
  if ($2 ~ /[Ff][Uu][Nn][Cc][Tt][Ii][Oo][Nn]/ ) {
      funcName=$3
      argSig=$0
      srchTarg= "^.*" funcName
      sub(srchTarg,"",argSig)
      # print "argSig=" argSig
      sub(/[\)].*$/, ")", argSig)
      # print "argSig=" argSig
      print "DROP FUNCTION " funcName argSig
    }
}

Be sure to chmod 755 genDrop.awk

Sample run

(I changed the first line of your sample input to)

CREATE FUNCTION folder_cycle_check(....

$ genDrop.awk dropFunction.txt
DROP FUNCTION folder_cycle_check (folder_key INTEGER, new_parent_folder_key INTEGER)

Also, naming your awk script genDrop.txt doesn't help convey your intent; surely you mean genDrop.awk.
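If you would rather not edit the input, the same idea can be stretched to accept both CREATE FUNCTION and CREATE OR REPLACE FUNCTION. What follows is only a minimal sketch, not a full SQL parser. The function name gen_drops is hypothetical, and it assumes that the whole signature sits on one line and that the argument list contains no nested parentheses (so no types like numeric(10,2) and no DEFAULT expressions). It lowercases a copy of each line for matching, then cuts the name and the first parenthesized group out of the original line.

```shell
#!/bin/sh
# gen_drops is a hypothetical helper name, not part of the original script.
# It reads SQL from the files given as arguments, or from stdin if none are given.
gen_drops() {
  awk '
    {
      line  = $0
      lower = tolower(line)   # match on a lowercased copy; works in any POSIX awk
      if (match(lower, /^[ \t]*create[ \t]+(or[ \t]+replace[ \t]+)?function[ \t]+/)) {
        rest = substr(line, RSTART + RLENGTH)   # text after the FUNCTION keyword
        lp = index(rest, "(")                   # first opening parenthesis
        rp = index(rest, ")")                   # first closing parenthesis
        if (lp > 0 && rp > lp) {
          name = substr(rest, 1, lp - 1)
          gsub(/[ \t]+$/, "", name)             # drop blanks between name and "("
          args = substr(rest, lp, rp - lp + 1)  # "(...)" including both parentheses
          print "DROP FUNCTION " name args ";"
        }
      }
    }
  ' "$@"
}
```

It could be used as `gen_drops functions.sql > drops.sql`, where both file names are placeholders. Copying the argument names along with the types is fine here, because PostgreSQL only looks at the argument types in DROP FUNCTION.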

I hope this helps.

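Allow me to welcome you to StackOverflow and remind you of three things we usually do here: 1) as you receive help, try to give it too, by answering questions in your area of expertise; 2) read the FAQ, http://tinyurl.com/2vycnvr; 3) when you see good Q&A, vote them up using the gray triangles, http://i.stack.imgur.com/kygEP.png, as the credibility of the system is based on the reputation that users gain by sharing their knowledge. Also remember to accept the answer that better solves your problem, if any, by pressing the checkmark sign, http://i.stack.imgur.com/uqJeW.png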