Question

I have several text files that need to be modified. They look like this:

Tag: Brown
Chair
Pencil
Tag: Red
Apple
Shirt
Pant
         # <--- some files have one or more (fewer than five) blank line(s) here
Tag: Black
Wall

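I want to reformat them by taking the word after "Tag:" as a variable and inserting it into each following line until the next "Tag:" is reached. The number of lines between "Tag:" lines may vary. Here is an example of the desired output: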

Brown Chair and Chairs
Brown Pencil and Pencils
Red Apple and Apples
Red Shirt and Shirts
Red Pant and Pants
         # <--- blank line(s) are kept as blank line(s)
Black Wall and Walls

I looked at http://sed.sourceforge.net/ and adapted some of the examples there, but still without success.

sed ':loop; $!N; /^Tag:/h; n; /^Tag:/!b next; t loop; :next; x; p; x'

Thanks.

Update

Following @jaypal's suggestion, and after looking carefully at each text file, I have added the "blank line" scenario.

Answer 1

The following code handles the simplest form of pluralization (as in your example):

awk '/^Tag:/ {c=$2; next} {print c, $1, "and", $1"s"}' file

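If the line matches the pattern, save the second field in c and skip to the next line. Otherwise, print the tag, the first word on the line, and that word again with a plain "s" appended.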
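As a middle ground, the same awk approach can be extended with a few suffix rules and a check for blank lines. This is only a sketch: the function names `pluralize_tags` and `plural` are made up for illustration, and the three rules below cover just a small slice of English plurals (irregular words such as "Mouse" are not handled).

```shell
pluralize_tags() {
  awk '
    # Hypothetical helper: a handful of common suffix rules, far from complete.
    function plural(w) {
      if (w ~ /(s|x|z|ch|sh)$/) return w "es"
      if (w ~ /[^aeiou]y$/)     return substr(w, 1, length(w) - 1) "ies"
      return w "s"
    }
    /^Tag:/ { c = $2; next }                           # remember the current tag
    NF      { print c, $1, "and", plural($1); next }   # non-blank line
            { print }                                  # blank line: keep as is
  ' "$@"
}
```

Called as `pluralize_tags file`, it reads the file (or standard input when no file is given) and leaves blank lines untouched.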

For something more upmarket that can pluralize a wider range of words, you can use the Lingua::EN::Inflect Perl module:

perl -MLingua::EN::Inflect=PL -lane 'if(@F==2){$c=$F[1]}else{print "@{[$c,$_,q/and/,PL $_]}"}' file

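The -a switch enables auto-split mode, which splits each line into the array @F. If there are two fields, save the second one in $c (a regular expression would work here too; I was just being creative). Otherwise, print the list. Wrapping the list in @{[ ]} inside double quotes joins its elements with the built-in variable $", which is a single space by default.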

Testing it out:

$ cat file
Tag: Brown
Chair
Pencil
Tag: Red
Apple
Shirt
Pant
Tag: White
Mouse
$ perl -MLingua::EN::Inflect=PL -lane 'if(@F==2){$c=$F[1]}else{print "@{[$c,$_,q/and/,PL $_]}"}' file
Brown Chair and Chairs
Brown Pencil and Pencils
Red Apple and Apples
Red Shirt and Shirts
Red Pant and Pants
White Mouse and Mice

Answer 2

My attempt with sed (no loops or branches; I like to keep things simple):

sed '/Tag:/{s/Tag: //;h;d;};G;s/\(.*\)\n\(.*\)/\2 \1 and \1s/'

Edit

To preserve blank lines:

sed '/Tag:/{s/Tag: //;h;d;};/./{G;s/\(.*\)\n\(.*\)/\2 \1 and \1s/;}'
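For readers who find the one-liner dense, here is the same script spread over several lines with comments. It is meant as an annotated sketch rather than a replacement; the wrapper function `tag_lines` is a name chosen for illustration, and the patterns are anchored with `^`, which the one-liner does not do.

```shell
tag_lines() {
  sed '
    # On a "Tag:" line: strip the prefix, copy the tag to the hold space,
    # then delete the line so nothing is printed for it.
    /^Tag:/{
      s/^Tag: //
      h
      d
    }
    # On a non-empty line: append the held tag (giving "item\ntag"),
    # then rearrange the two parts into the output format.
    /./{
      G
      s/\(.*\)\n\(.*\)/\2 \1 and \1s/
    }
    # Empty lines match neither block and are printed unchanged.
  ' "$@"
}
```

Usage is `tag_lines file`, or `some_command | tag_lines` to read from standard input.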

Answer 3

Given an input file like the one posted in the question, with 2 blank lines:

$ awk '/^Tag:/{tag=$2; next} {print (NF ? tag " " $0 " and " $0 "s" : $0)}' file
Brown Chair and Chairs
Brown Pencil and Pencils
Red Apple and Apples
Red Shirt and Shirts
Red Pant and Pants


Black Wall and Walls
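Since the question mentions several files, one possible way to apply this command to each of them is a small loop. This is a sketch under assumptions: the function name `tag_all` and the `.out` suffix for the result files are invented here, and the originals are left untouched.

```shell
tag_all() {
  for f in "$@"; do
    # One awk run per file, so a tag never carries over from the previous file.
    awk '/^Tag:/{tag=$2; next} {print (NF ? tag " " $0 " and " $0 "s" : $0)}' "$f" > "$f.out"
  done
}
```

For example, `tag_all *.txt` would write `a.txt.out` next to `a.txt`, and so on for each file.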

Search for a pattern from a tag and insert it into the following lines using sed or awk

3 answers: