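I would like to know how to print the first word of each paragraph with a sed one-liner. In this case, a paragraph is defined as text that follows two newlines.
For example: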
This is a paragraph with some text. Some random text that is not really important.

This is another paragraph with some text.
However this sentence is still in the same paragraph.
This should be converted to:
This
This
Answer 0 (score: 7)
As the gawk manual puts it: "By a special dispensation, an empty string as the value of RS indicates that records are separated by one or more blank lines."
Both awk and perl support this "paragraph mode", and either makes a better choice than sed:
awk '{ print $1 }' RS= ORS="\n\n" file
or
perl -00 -lane 'print $F[0]' file
Result:
This
This
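To make the paragraph-mode behaviour concrete, here is a minimal sketch that feeds the sample text from the question to awk through a here-document. The function name first_words is hypothetical and only used for illustration. The sketch deliberately leaves ORS at its default, on the assumption that one word per line with no blank lines in between is the wanted format; setting ORS="\n\n" as in the command above would emit a blank line after every word.

```shell
# Hypothetical helper: print the first word of every paragraph read from stdin.
first_words() {
    # An empty RS puts awk into paragraph mode: records are separated by
    # one or more blank lines, so $1 is the first word of each paragraph.
    awk 'BEGIN { RS = "" } { print $1 }'
}

# Sample input taken from the question.
first_words <<'EOF'
This is a paragraph with some text. Some random text that is not really important.

This is another paragraph with some text.
However this sentence is still in the same paragraph.
EOF
```

With the sample input this should print "This" twice, one per line. Note that awk only treats truly empty lines as paragraph separators here, so a line containing just spaces would not split a paragraph.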
Answer 1 (score: 2)
A possible GNU sed solution is:
sed -rn ':a;/^ *$/{n;ba};s/( |$).*//p;:b;n;/^ *$/ba;bb'
Output:
This
This
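It treats lines containing only spaces as empty and copes with any number of empty lines between paragraphs. It also handles single-word paragraphs correctly.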
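The one-liner is dense, so here is the same script spread over several lines with comments, as a reading aid rather than a replacement. The wrapper name first_words_sed is hypothetical, and -E is used as the equivalent spelling of -r in GNU sed; the commands themselves are unchanged.

```shell
# Hypothetical wrapper around the sed script above, reading from stdin.
first_words_sed() {
    sed -E -n '
        :a
        # Skip empty or spaces-only lines in front of a paragraph.
        /^ *$/ {
            n
            ba
        }
        # First line of a paragraph: delete from the first space (or from the
        # end of the line, for a one-word line) onwards, then print the word.
        s/( |$).*//p
        :b
        # Read the following lines of the paragraph without printing them.
        n
        # An empty line ends the paragraph: go back and look for the next one.
        /^ *$/ ba
        bb
    '
}

# Spaces-only separator, repeated blank lines, and a one-word paragraph.
printf 'one two\nthree\n  \n\nsolo\n\nfour five\n' | first_words_sed
```

The ( |$) alternation is what makes one-word paragraphs work: when the line has no space, the match happens at the end of the line, the substitution still counts as successful, and the p flag prints the word.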
Answer 2 (score: 0)
This might work for you (GNU sed):
sed ':a;$!{N;/\n\s*$/!ba};s/\s.*/\n/' file