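How to delete lines containing more than N words

Time: 2015-07-02 14:21:30

Tags: string bash

Is there a good one-liner in bash to delete the lines of a file that contain more than N words?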

Example input:

I want this, not that, but thank you it is very nice of you to offer.
The very long sentence finding form ordering system always and redundantly requires an initial, albeit annoying and sometimes nonsensical use of commas, completion of the form A-1 followed, after this has been processed by the finance department and is legal, by a positive approval that allows for the form B-1 to be completed after the affirmative response to the form A-1 is received.

Example output:

I want this, not that, but thank you it is very nice of you to offer.

In Python, I would write something like this:

if len(line.split()) < 40:
    print(line)

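2 answers:

Answer 0: (score: 4)

Note that this answer addresses the first version of the question: how to print the lines that are shorter than a given number of characters.

Use awk with length():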
awk 'length($0)<40' file

You can even pass the length as a parameter:

awk -v maxsize=40 'length($0) < maxsize' file

A test with a limit of 10 characters:

$ cat a
hello
how are you
i am fine but
i would like
to do other
things
$ awk 'length($0)<10' a
hello
things

If you want to use sed, you can say:

sed -rn '/^.{,39}$/p' file

This checks whether the line contains fewer than 40 characters (that is, at most 39). If so, the line is printed.
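
As far as I know, the -r flag and the {,39} form with an omitted lower bound are GNU sed conveniences. The same check can be sketched with a basic regular expression interval, which should also work on other sed implementations. The function name lines_shorter_than is made up for this illustration, and the sketch assumes the limit is at least 1:

```shell
# Hypothetical helper (not from the answer): print lines shorter than LIMIT characters.
# Usage: lines_shorter_than LIMIT [FILE...]
# Assumes LIMIT >= 1.
lines_shorter_than() {
    limit=$1
    shift
    # The longest allowed line has limit - 1 characters, so the pattern
    # becomes e.g. /^.\{0,9\}$/ for a limit of 10.
    sed -n "/^.\{0,$((limit - 1))\}\$/p" "$@"
}

# Same data as the awk test above; prints "hello" and "things".
printf '%s\n' 'hello' 'how are you' 'things' | lines_shorter_than 10
```

Passing a file name instead of piping works the same way, for example lines_shorter_than 40 file.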

Answer 1: (score: 3)

To show only the lines containing fewer than 40 words, you can use awk:

awk 'NF < 40' file

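With the default field separator, awk splits each line on whitespace, so every word counts as one field. NF holds the number of fields, and the condition NF < 40 prints only the lines with fewer than 40 of them.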
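
The limit can be passed in as a variable, in the same way as the length example in the other answer. The sketch below is only an illustration: the function name keep_short_lines is made up, and it uses <= so that it follows the title literally (lines with more than N words are dropped), whereas the snippets above use a strict <.

```shell
# Hypothetical helper (not from the answer): keep lines with at most MAX_WORDS words.
# Usage: keep_short_lines MAX_WORDS [FILE...]
keep_short_lines() {
    max_words=$1
    shift
    # NF is awk's field count; with the default separator, fields are
    # whitespace-delimited words. Empty lines have NF == 0 and are kept.
    awk -v max="$max_words" 'NF <= max' "$@"
}

# Prints "one two three" and "six seven"; the five-word line is dropped.
printf '%s\n' 'one two three' 'one two three four five' 'six seven' |
    keep_short_lines 3
```

awk only writes the filtered lines to standard output. To actually remove the lines from the file, redirect the output to a temporary file and then move it over the original.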