Using sed

Date: 2016-06-10 18:32:08

Tags: macos sed

I have a block of text in which some sections are set off by a four-space indent:

PERCHANCE he for whom this bell tolls may be so ill, as that he knows not it
tolls for him; and perchance I may think myself so much better than I am, as
that they who are about me, and see my state, may have caused it to toll for me,
and I know not that. 

    The church is Catholic, universal, so are all her actions; all that she does
    belongs to all. When she baptizes a child, that action concerns me; for that
    child is thereby connected to that body which is my head too, and ingrafted into
    that body whereof I am a member.

And when she buries a man, that action concerns me: all mankind is of one
author, and is one volume; when one man dies, one chapter is not torn out of the
book, but translated into a better language; and every chapter must be so
translated; God employs several translators; some pieces are translated by age,
some by sickness, some by war, some by justice; but God's hand is in every
translation, and his hand shall bind up all our scattered leaves again for that
library where every book shall lie open to one another.

    As therefore the bell that rings to a sermon calls not upon the preacher only,
    but upon the congregation to come, so this bell calls us all; but how much more
    me, who am brought so near the door by this sickness.

There was a contention as far as a suit (in which both piety and dignity,
religion and estimation, were mingled), which of the religious orders should
ring to prayers first in the morning; and it was determined, that they should
ring first that rose earliest.

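I want each indented block to be immediately preceded by START QUOTE and immediately followed by END QUOTE. I have been playing with sed for fifteen minutes and still cannot get it right. This is my best effort so far: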

#!/usr/bin/sed -Ef
/^$/ {
N
    /\n    / {
    P
    s/^\n//
    i\
    START QUOTE
    }
}

/^    / {
N
    /\n$/ {
    s/\n$/&END QUOTE/
    G
    }
}

Running ./parse.sed <script.txt, I get the following output:

PERCHANCE he for whom this bell tolls may be so ill, as that he knows not it
tolls for him; and perchance I may think myself so much better than I am, as
that they who are about me, and see my state, may have caused it to toll for me,
and I know not that. 

START QUOTE
    The church is Catholic, universal, so are all her actions; all that she does
    belongs to all. When she baptizes a child, that action concerns me; for that
    child is thereby connected to that body which is my head too, and ingrafted into
    that body whereof I am a member.

And when she buries a man, that action concerns me: all mankind is of one
author, and is one volume; when one man dies, one chapter is not torn out of the
book, but translated into a better language; and every chapter must be so
translated; God employs several translators; some pieces are translated by age,
some by sickness, some by war, some by justice; but God's hand is in every
translation, and his hand shall bind up all our scattered leaves again for that
library where every book shall lie open to one another.

START QUOTE
    As therefore the bell that rings to a sermon calls not upon the preacher only,
    but upon the congregation to come, so this bell calls us all; but how much more
    me, who am brought so near the door by this sickness.
END QUOTE

There was a contention as far as a suit (in which both piety and dignity,
religion and estimation, were mingled), which of the religious orders should
ring to prayers first in the morning; and it was determined, that they should
ring first that rose earliest.

Note the missing END QUOTE on the first quote block. I think what is happening is that the second command in the script:

/^    / {
N
    /\n$/ {
    s/\n$/&END QUOTE/
    G
    }
}
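
finds the boundary correctly only when the current line happens to be the last line of the quote block. Sometimes it is off by one: the boundary is swallowed across two separate N commands and so is never recognized. Any pointers on the right way to do this with sed?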

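4 answers:

Answer 0 (score: 1)

Using sed

When looking for the end of a quote, the original script reads lines in pairs. As a result, the end of the quote is found only when the quote contains an odd number of lines. The solution is to read the whole quote at once and then append END QUOTE to it: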

#!/usr/bin/sed -Ef
/^$/ {
N
    /\n    / {
    P
    s/^\n//
    i\
    START QUOTE
    }
}

/^    / {
    :a;N;/\n$/!ba
    s/$/END QUOTE\n/
}

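The key change here is :a;N;/\n$/!ba, which keeps appending lines to the pattern space until it reaches an empty line.

[The above was tested under GNU sed. BSD (OSX) sed is often subtly different.]

Using awk

sed can do anything, but tasks with complex logic are usually easier in awk. For your problem, try: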

awk '/^    / && q{print;next} q{print "END QUOTE"; q=0} /^    /{print "START QUOTE"; q=1} 1' file

With your input, for example:

$ awk '/^    / && q{print;next} q{print "END QUOTE"; q=0} /^    /{print "START QUOTE"; q=1} 1' file
PERCHANCE he for whom this bell tolls may be so ill, as that he knows not it
tolls for him; and perchance I may think myself so much better than I am, as
that they who are about me, and see my state, may have caused it to toll for me,
and I know not that. 

START QUOTE
    The church is Catholic, universal, so are all her actions; all that she does
    belongs to all. When she baptizes a child, that action concerns me; for that
    child is thereby connected to that body which is my head too, and ingrafted into
    that body whereof I am a member.
END QUOTE

And when she buries a man, that action concerns me: all mankind is of one
author, and is one volume; when one man dies, one chapter is not torn out of the
book, but translated into a better language; and every chapter must be so
translated; God employs several translators; some pieces are translated by age,
some by sickness, some by war, some by justice; but God's hand is in every
translation, and his hand shall bind up all our scattered leaves again for that
library where every book shall lie open to one another.

START QUOTE
    As therefore the bell that rings to a sermon calls not upon the preacher only,
    but upon the congregation to come, so this bell calls us all; but how much more
    me, who am brought so near the door by this sickness.
END QUOTE

There was a contention as far as a suit (in which both piety and dignity,
religion and estimation, were mingled), which of the religious orders should
ring to prayers first in the morning; and it was determined, that they should
ring first that rose earliest.

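How it works

This script uses a single variable q, which is 1 when we are inside a quote and 0 otherwise.

  • /^    / && q{print;next}

    If q is true and the line begins with four spaces, print the line, skip the remaining commands, and move on to the next line.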

  • q{print "END QUOTE"; q=0}

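    If we get here while q is true, then this line does not begin with four spaces. That means the quote has just ended, so we print END QUOTE and reset q to false (0).

  • /^    /{print "START QUOTE"; q=1}

    If we get here on a line that begins with four spaces, then a quote has just started. We print START QUOTE and set q to true (1).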

  • 1

    This is awk's cryptic shorthand for printing the current line.
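
For readability, the same four rules can be laid out one per line with comments. This is only a sketch of the one-liner above, not a different method; the wrapper name mark_quotes is hypothetical and simply reads the text on standard input.

```shell
# mark_quotes: hypothetical wrapper name, not part of the original answer.
# Reads text on stdin and writes the marked-up text to stdout.
mark_quotes() {
  awk '
    # Already inside a quote and still indented: print and skip the other rules.
    /^    / && q { print; next }
    # Inside a quote but this line is not indented: the quote just ended.
    q { print "END QUOTE"; q = 0 }
    # Indented line while outside a quote: a new quote starts here.
    /^    / { print "START QUOTE"; q = 1 }
    # A bare true pattern prints the current line.
    1
  '
}
```

It would be called as mark_quotes <script.txt. The order of the rules matters: the first rule has to run before the other two so that lines in the middle of a quote never trigger START QUOTE again.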

Answer 1 (score: 1)

Try this:

#!/usr/bin/sed -f
/^    / {
    H
    d
  }
/^$/ {
  x
  s/^\n    /START QUOTE&/
  /    /s/$/\nEND QUOTE\n/
}

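Lines that begin with four spaces are appended to the hold space (H) and deleted from the pattern space (d).

When the next empty line /^$/ is found, x swaps the contents of the hold space and the pattern space. We then add START QUOTE and END QUOTE to the beginning and end of the block.

Answer 2 (score: 1)

This might work for you (GNU sed):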

sed -r 'N;/^\n\s{4}\S/s//\nSTART QUOTE&/;/^\s{4}\S.*\n$/s//&END QUOTE\n/;t;P;D' file

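Process the file in a running window of two lines (N ... P;D). When a pair matches one of the required patterns, prepend or append the required literal, then bail out (see t) and carry on with the next pair of lines.

An alternative: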

sed '/^    /{s/^/START QUOTE\n/;:a;n;/^    /ba;s/^/END QUOTE\n/}'  file

Answer 3 (score: 1)

sed is for simple substitutions on individual lines, that is all. For anything else you should use awk:

$ cat tst.awk
!inBlock && /^    / { print "START QUOTE"; inBlock=1 }
inBlock && !/^    / { print "END QUOTE"; inBlock=0 }
{ print }

$ awk -f tst.awk file
PERCHANCE he for whom this bell tolls may be so ill, as that he knows not it
tolls for him; and perchance I may think myself so much better than I am, as
that they who are about me, and see my state, may have caused it to toll for me,
and I know not that.

START QUOTE
    The church is Catholic, universal, so are all her actions; all that she does
    belongs to all. When she baptizes a child, that action concerns me; for that
    child is thereby connected to that body which is my head too, and ingrafted into
    that body whereof I am a member.
END QUOTE

And when she buries a man, that action concerns me: all mankind is of one
author, and is one volume; when one man dies, one chapter is not torn out of the
book, but translated into a better language; and every chapter must be so
translated; God employs several translators; some pieces are translated by age,
some by sickness, some by war, some by justice; but God's hand is in every
translation, and his hand shall bind up all our scattered leaves again for that
library where every book shall lie open to one another.

START QUOTE
    As therefore the bell that rings to a sermon calls not upon the preacher only,
    but upon the congregation to come, so this bell calls us all; but how much more
    me, who am brought so near the door by this sickness.
END QUOTE

There was a contention as far as a suit (in which both piety and dignity,
religion and estimation, were mingled), which of the religious orders should
ring to prayers first in the morning; and it was determined, that they should
ring first that rose earliest.
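
One case the sample input does not exercise is a quote block that runs to the very last line of the file. The script above then never sees a non-indented line, so the closing END QUOTE is not printed. A minimal sketch of one way to cover that, assuming an END rule is acceptable; the wrapper name mark_quotes_eof is hypothetical:

```shell
# mark_quotes_eof: hypothetical wrapper name; reads text on stdin.
mark_quotes_eof() {
  awk '
    # Entering an indented block: open the quote.
    !inBlock && /^    / { print "START QUOTE"; inBlock = 1 }
    # Leaving an indented block: close the quote.
    inBlock && !/^    / { print "END QUOTE"; inBlock = 0 }
    # Always print the current line.
    { print }
    # Input ended while still inside a block: close the quote here.
    END { if (inBlock) print "END QUOTE" }
  '
}
```

For input that ends outside a quote, the END rule does nothing and the output is the same as that of tst.awk.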