Let's assume I have a text file containing records from different sources. The file looks like this:
1000 Once upon a time, happy end.
1001 Tornado in NY city, the statue was finally found.
1002 I bought her an iphone
yes
for $1000. And then
happy end.
1003 How many times
have I seen it?
not many. Actually.
1004 5 Cars. 2 Toys. 3 Birds.
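Each line ends with \n, and each record begins with a line number in the range {1000 ... 2000}. The line number is separated from the text by a tab (\t).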
So how can I use sed to count the occurrences of "." in each record?
Can sed replace all characters except those given in a pattern, without grouping them into [^...]?
The output should look like this:
1000 1
1001 1
1002 2
1003 2
1004 3
Answer 0 (score: 3)
Here is one way:
$ awk -v r=1000 '{print r++,split($0,a,".")-1}' RS="\n[0-9]+\t" file
1000 1
1001 1
1002 2
1003 2
1004 3
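The one-liner above relies on two things: an awk that treats a multi-character RS as a regular expression (gawk and mawk do; POSIX leaves this unspecified), and record numbers that run consecutively from 1000. If either might not hold, a line-by-line variant is a possible alternative. The following is only a sketch: `count_dots` is a hypothetical helper name, and it assumes that every record starts with digits followed by a tab and that no continuation line does.

```shell
# count_dots FILE  (hypothetical helper name, not from the original answer)
# Prints "<record id> <number of periods>" for each record in FILE.
# Assumes a record starts on a line of the form "<digits><TAB><text>"
# and that continuation lines never start that way.
count_dots() {
  awk -F'\t' '
    # Print the record collected so far, if there is one.
    function emit() { if (id != "") print id, n }

    # A line starting with digits and a tab opens a new record.
    /^[0-9]+\t/ { emit(); id = $1; n = 0 }

    # gsub returns the number of matches; replacing "." with "."
    # leaves the text unchanged, so this only counts periods.
    { n += gsub(/\./, ".") }

    # Print the last record.
    END { emit() }
  ' "$1"
}
```

Called as `count_dots file` on the sample input, this should print the same five lines as the expected output in the question, reading each record number from the file instead of generating it with a counter.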