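Using sed to print only the first word of each paragraph

Time: 2013-05-05 14:49:26

Tags: sed

I would like to know how to print the first word of each paragraph with a sed one-liner. In this case, a paragraph is defined as the text that follows two newlines.

For example: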

This is a paragraph with some text. Some random text that is not really important.

This is another paragraph with some text.
However this sentence is still in the same paragraph.

This should be converted to

This

This

3 answers:

Answer 0 (score: 7)

Paragraph mode, as it applies to awk:

By a special dispensation, an empty string as the value of RS indicates that records are separated by one or more blank lines.

Both awk and perl support "paragraph mode", and either is a better choice than sed:

awk '{ print $1 }' RS= ORS="\n\n" file

perl -00 -lane 'print $F[0]' file

Result:

This

This
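To make the paragraph-mode behaviour easier to see, here is a minimal sketch that rebuilds the sample input from the question in a temporary file and runs the awk variant on it. The `sample` and `first_words` variable names are just placeholders for illustration, and RS and ORS are set in a BEGIN block instead of on the command line, which should be equivalent here.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'This is a paragraph with some text. Some random text that is not really important.' \
  '' \
  'This is another paragraph with some text.' \
  'However this sentence is still in the same paragraph.' > "$sample"

# RS = "" switches awk to paragraph mode: runs of blank lines separate records,
# so $1 is the first word of each paragraph.
# ORS = "\n\n" prints a blank line after each word, as in the question.
first_words=$(awk 'BEGIN { RS = ""; ORS = "\n\n" } { print $1 }' "$sample")

# Command substitution drops the trailing newlines, so this shows
# the two words separated by one empty line.
printf '%s\n' "$first_words"
rm -f "$sample"
```

Dropping the ORS assignment would print the words on consecutive lines instead.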

Answer 1 (score: 2)

A possible GNU sed solution is:

sed -rn ':a;/^ *$/{n;ba};s/( |$).*//p;:b;n;/^ *$/ba;bb'

Output:

This
This

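It treats lines containing only spaces as empty and copes with any number of blank lines between paragraphs. It also handles single-word paragraphs correctly.

Answer 2 (score: 0)

This might work for you (GNU sed):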
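The one-liner is dense, so here is the same script spread over several lines with comments, as a sketch of how it appears to work. The input is a made-up sample (not from the question) with a space-only line and a single-word paragraph, and the `sample` and `result` names are placeholders.

```shell
# Hypothetical sample input: two blank-ish lines between the first two
# paragraphs, and a single-word paragraph at the end.
sample=$(mktemp)
printf '%s\n' \
  'This is a paragraph.' \
  '' \
  '   ' \
  'Another paragraph here.' \
  'Same paragraph.' \
  '' \
  'Single' > "$sample"

result=$(sed -rn '
# State "a": looking for the first line of a paragraph.
:a
# Empty or space-only line: fetch the next line and keep looking.
/^ *$/{
  n
  ba
}
# First line of a paragraph: delete from the first space (or the end of
# the line) onward, then print what is left.
s/( |$).*//p
# State "b": skip the remaining lines of the current paragraph.
:b
n
# A blank line ends the paragraph, so go back to state "a".
/^ *$/ba
bb
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Note that the pattern `^ *$` only matches spaces, so a line containing a tab would not count as blank in this version.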

sed ':a;$!{N;/\n\s*$/!ba};s/\s.*/\n/' file
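For reference, here is the same command written out with comments and run against the sample from the question. This is only a sketch of how the script seems to operate; the `sample` and `result` names are placeholders, and it relies on GNU sed for `\s` and for `\n` in the replacement.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'This is a paragraph with some text. Some random text that is not really important.' \
  '' \
  'This is another paragraph with some text.' \
  'However this sentence is still in the same paragraph.' > "$sample"

result=$(sed '
# Collect lines in the pattern space until a blank line or the end of input.
:a
$!{
  N
  # Keep looping unless the line just appended is empty or whitespace-only.
  /\n\s*$/!ba
}
# Replace everything from the first whitespace character onward with a
# newline, leaving the first word followed by an empty line.
s/\s.*/\n/
' "$sample")

# Command substitution drops the trailing newlines, so this shows
# the two words separated by one empty line.
printf '%s\n' "$result"
rm -f "$sample"
```

This was only checked against input with exactly one blank line between paragraphs, as in the question.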