Question

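I am implementing an annotation feature in Bash, and I am looking for an awk or sed solution for some text manipulation.

I want to transform text in a file from: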

^version 10.2 tag1 tag2
^audit arg1 arg2
f()
{
...
}
g()
{
...
}
^version 10.2
h() { ... }
^version 10.2

i() { ... } # Not annotated: doesn't immediately follow an annotation

into:

annotate f^1 version 10.2 tag1 tag2
annotate f^1 audit arg1 arg2
f^1()
{
...
}
g()
{
...
}
annotate h^2 version 10.2
h^2() { ... }

i() { ... } # Not annotated: doesn't immediately follow an annotation

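The substitutions work as follows:

Lines starting with ^ are replaced with annotate, a space, the name of the function found after the annotation lines, ^, the index, and the rest of the line
The function name is suffixed with ^ and the index (after which the index is incremented)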

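Function names start in column 1, and Bash function names do not need to be POSIX-compliant (see builtins/declare.def in the Bash source: shell function names don't have to be valid identifiers; and in parse.y, a function name is a WORD). An acceptable, imperfect regular expression for the function part of the pattern is given below. I would also welcome answers that work out a better regex, even if they don't answer the larger question, because it is hard to figure out from reading the source: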

^[^'"()]\+\s*(\s*)
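
To see what this pattern accepts, it can be tried against a few sample lines with grep, which is available on a minimal install. The following is only a sketch: it rewrites the pattern in ERE syntax (for `grep -E`), uses `[[:space:]]` in place of `\s`, and the names `fn_header_re` and `is_function_header` as well as the sample lines are made up for illustration.

```shell
# ERE form of the pattern: name, optional blanks, "(", optional blanks, ")".
# fn_header_re is a hypothetical variable name.
fn_header_re="^[^'\"()]+[[:space:]]*\\([[:space:]]*\\)"

# Hypothetical helper: succeeds when the line looks like a function header.
is_function_header() {
    printf '%s\n' "$1" | grep -Eq -- "$fn_header_re"
}

# Classify a few illustrative lines.
for line in 'f()' 'my-func ()' 'h() { ... }' '{' 'echo "x()"'; do
    if is_function_header "$line"; then
        printf 'function header: %s\n' "$line"
    else
        printf 'other line:      %s\n' "$line"
    fi
done
```

Because the character class also admits blanks, a line such as `a b ()` is accepted as well, which is one way in which the pattern is imperfect.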

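Note that annotations apply only to the function that immediately follows them. If a function does not immediately follow the annotation lines, no annotation should be emitted at all.

The solution should be generic and must not depend on the strings found in the example above (version, audit, f, g, h, etc.).

The solution must not require utilities or packages that are not found in CentOS 7 Minimal. So, unfortunately, Perl cannot be considered. I would prefer an awk solution.

Your answer will be used to improve the code of an open source Bash project: Eggsh.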

Answer 1

Try something like this:

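# Buffer each annotation line (leading "^" stripped) until we see what follows.
/^\^/ { acc[++ann] = substr($0, 2); next }
# A function header right after annotations: name, optional blanks, then "()".
ann && /^[^'"()]+\([[:space:]]*\)/ {
    ind = index($0, "(")
    fname = substr($0, 1, ind - 1)
    sub(/[[:space:]]+$/, "", fname)   # drop blanks between the name and "("
    count++                           # the index advances only when it is used
    for (i = 1; i <= ann; i++)
        print "annotate " fname "^" count " " acc[i]
    print fname "^" count substr($0, ind)
    ann = 0
    next
}
# Any other line discards pending annotations and is printed unchanged.
{ ann = 0; print }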

Note that I did not bother to do the research needed to find a better function-name regexp.
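
One way to try the program from Bash is to save it to a file and pass it to awk with `-f`. This is a minimal sketch, assuming the awk program above has been saved as `annotate.awk`; that file name, the `ANNOTATE_AWK` variable and the wrapper name `annotate_functions` are all hypothetical.

```shell
# Hypothetical wrapper: runs the saved awk program on the given files,
# or on standard input when no file is given.
# ANNOTATE_AWK is an assumed variable naming the script (default: annotate.awk).
annotate_functions() {
    awk -f "${ANNOTATE_AWK:-annotate.awk}" -- "$@"
}

# Example call (file names are placeholders):
#   annotate_functions lib.sh > lib.annotated.sh
```

Keeping the program in its own file avoids quoting problems, since the function-name pattern contains both single and double quotes.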
