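Delete text up to a blank line

Time: 2015-05-07 01:11:17

Tags: regex awk sed

I have a report log file that lists a bunch of missing source files, and I want to strip out the entries for files that are fine. In such a log, how would I delete the line "The following files have been resolved:" and everything after it up to the space? The number of resolved files varies, so I can't simply delete a fixed number of lines after seeing that phrase.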

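Again, the only things I'm after are the package name and the files that have NOT been resolved.

I'm sure some sed/awk command can do this, but I just can't work out the regular expression to get there. :(

When I search for this, all I find is "delete blank lines", which isn't really what I want.

Thanks in advance.

4 Answers:

Answer 0 (score: 1)

"How would I delete the line 'The following files have been resolved:' and everything after it up to the space?"

By "space", I assume you mean the gap created by a blank line.

Using sed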

 $ sed '/The following files have been resolved/,/^$/d' file
------------------------------------------------------------------------
 Building karaf-parent 1.5.0-SNAPSHOT
 ------------------------------------------------------------------------

 --- maven-dependency-plugin:2.10:sources (default-cli) @ karaf-parent ---

 The following files have NOT been resolved:
    org.apache.karaf.features:standard:xml:sources:3.0.3:runtime

Using awk

$ awk '/The following files have been resolved/,/^$/{next;} 1' file
------------------------------------------------------------------------
 Building karaf-parent 1.5.0-SNAPSHOT
 ------------------------------------------------------------------------

 --- maven-dependency-plugin:2.10:sources (default-cli) @ karaf-parent ---

 The following files have NOT been resolved:
    org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
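
As a rough, self-contained check of the two commands above, the sketch below writes a small log to a temporary file and runs both. The log contents are illustrative (entries borrowed from the samples in this thread), and the variable names `log`, `sed_out` and `awk_out` are just placeholders.

```shell
# Write a small sample log to a temporary file (contents are illustrative).
log=$(mktemp)
cat > "$log" <<'EOF'
Building karaf-parent 1.5.0-SNAPSHOT

The following files have been resolved:
   org.ow2.asm:asm:jar:sources:4.0:test

The following files have NOT been resolved:
   org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
EOF

# sed: the range address selects the header line through the next empty line; d deletes it.
sed_out=$(sed '/The following files have been resolved/,/^$/d' "$log")

# awk: skip every line inside the same range, print everything else.
awk_out=$(awk '/The following files have been resolved/,/^$/{next} 1' "$log")

printf '%s\n' "$sed_out"
rm -f "$log"
```

Note that the start pattern does not match the "have NOT been resolved" header, because the word NOT sits between "have" and "been".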

Alternative: keep only the unresolved files

$ awk '/The following files have NOT been resolved/,/^$/' file
 The following files have NOT been resolved:
    org.apache.karaf.features:standard:xml:sources:3.0.3:runtime

Or, without the header:

$ awk ' /^$/{f=0} f{print} /The following files have NOT been resolved/{f=1}' file
    org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
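
The header-less variant relies on the order of its three rules. The sketch below spells the same program out with one rule per line, run against a small illustrative log; the variable names `log` and `unresolved` are placeholders.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
The following files have been resolved:
   org.ow2.asm:asm:jar:sources:4.0:test

The following files have NOT been resolved:
   antlr:antlr:jar:sources:2.7.7:test

EOF

# f is a flag meaning "we are inside the unresolved section".
unresolved=$(awk '
    /^$/ { f = 0 }    # an empty line closes the section before it can be printed
    f    { print }    # print only while the flag is set
    /The following files have NOT been resolved/ { f = 1 }    # set the flag after the print rule, so the header is skipped
' "$log")

printf '%s\n' "$unresolved"
rm -f "$log"
```

Because the flag is set only after the print rule has run for the header line, the header is never printed, and because the reset rule comes first, neither is the closing empty line.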

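Revised question

Judging from a pastebin sample log, none of the blank lines are actually empty: each contains at least one space. We can handle that. With a POSIX sed, the following should work: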

sed '/The following files have been resolved/,/^[[:space:]]*$/d' monitor.log

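[[:space:]] is a unicode-safe way of matching whitespace. If your sed does not support it, use the following (note that \t inside a bracket expression is a GNU sed extension, so with other implementations type a literal tab instead):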

sed '/The following files have been resolved/,/^[ \t]*$/d' monitor.log
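
To see why the whitespace matters, here is a small sketch comparing the strict `/^$/` end pattern with the lenient one on a log whose "blank" line holds a single space. The log contents are illustrative, and `strict` and `lenient` are placeholder names.

```shell
log=$(mktemp)
# printf is used so the whitespace-only line (a single space) is visible in the source.
printf '%s\n' \
    'The following files have been resolved:' \
    '   org.ow2.asm:asm:jar:sources:4.0:test' \
    ' ' \
    'The following files have NOT been resolved:' \
    '   antlr:antlr:jar:sources:2.7.7:test' > "$log"

# /^$/ never matches the " " line, so the range runs to the end of the input.
strict=$(sed '/The following files have been resolved/,/^$/d' "$log")

# Allowing optional whitespace ends the range at the intended line.
lenient=$(sed '/The following files have been resolved/,/^[[:space:]]*$/d' "$log")

printf 'strict:  [%s]\nlenient: [%s]\n' "$strict" "$lenient"
rm -f "$log"
```

With the strict pattern the end of the range is never found, so sed deletes through the last line and the unresolved section is lost as well.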

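Additionally, in the unedited log, the lines of interest start with [INFO]. The following works whether or not the lines start with [INFO] (the \? operator is a GNU sed extension):
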
sed '/The following files have been resolved/,/^\([[]INFO[]]\)\?[ \t\r]*$/d' monitor.log

For example, consider this sample (taken from the pastebin source):

$ cat log2
[INFO] ------------------------------------------------------------------------
[INFO] Building yang-data-impl 0.7.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.10:sources (default-cli) @ yang-data-impl ---
[INFO] 
[INFO] The following files have been resolved:
[INFO]    org.opendaylight.yangtools:yang-binding:jar:sources:0.7.0-SNAPSHOT:compile
[INFO]    org.opendaylight.yangtools:yang-common:jar:sources:0.7.0-SNAPSHOT:compile
[INFO]    org.ow2.asm:asm:jar:sources:4.0:test
[INFO] 
[INFO] The following files have NOT been resolved:
[INFO]    antlr:antlr:jar:sources:2.7.7:test
[INFO] 

Our sed command works as follows:

$ sed '/The following files have been resolved/,/^\([[]INFO[]]\)\?[ \t\r]*$/d' log2
[INFO] ------------------------------------------------------------------------
[INFO] Building yang-data-impl 0.7.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.10:sources (default-cli) @ yang-data-impl ---
[INFO] 
[INFO] The following files have NOT been resolved:
[INFO]    antlr:antlr:jar:sources:2.7.7:test
[INFO] 
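
If GNU-specific escapes are a concern, one possible variant (an assumption on my part, not something from the original answer) is to switch to extended regular expressions with `sed -E`, which both GNU and BSD sed accept. The sketch below applies it to the same sample; `log` and `filtered` are placeholder names.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
[INFO] ------------------------------------------------------------------------
[INFO] Building yang-data-impl 0.7.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.10:sources (default-cli) @ yang-data-impl ---
[INFO] 
[INFO] The following files have been resolved:
[INFO]    org.opendaylight.yangtools:yang-binding:jar:sources:0.7.0-SNAPSHOT:compile
[INFO]    org.opendaylight.yangtools:yang-common:jar:sources:0.7.0-SNAPSHOT:compile
[INFO]    org.ow2.asm:asm:jar:sources:4.0:test
[INFO] 
[INFO] The following files have NOT been resolved:
[INFO]    antlr:antlr:jar:sources:2.7.7:test
[INFO] 
EOF

# -E selects extended regular expressions, where ( ) and ? need no backslashes.
# The end pattern accepts an optional [INFO] prefix followed only by whitespace.
filtered=$(sed -E '/The following files have been resolved/,/^(\[INFO\])?[[:space:]]*$/d' "$log")

printf '%s\n' "$filtered"
rm -f "$log"
```

Here [[:space:]] also covers a trailing carriage return, which the original pattern handles with an explicit \r.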

Answer 1 (score: 0)

sed 1,/"NOT been resolved:"/d file

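This works if you are sure that the unresolved lines are the last entry, with no other text after them (otherwise you would need to grab just that paragraph). It works by deleting every line from the first one through the line that matches.

Answer 2 (score: 0)

Thanks to @John1024, I got on the right track.

However, this is the answer I ended up with: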
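
A minimal sketch of that behaviour, using an illustrative log in which the unresolved block comes last (`log` and `tail_only` are placeholder names):

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
The following files have been resolved:
   org.ow2.asm:asm:jar:sources:4.0:test

The following files have NOT been resolved:
   antlr:antlr:jar:sources:2.7.7:test
EOF

# Delete from line 1 through the first line matching the pattern;
# whatever follows the header is all that remains.
tail_only=$(sed '1,/NOT been resolved:/d' "$log")

printf '%s\n' "$tail_only"
rm -f "$log"
```

The header line itself is part of the deleted range, so only the entries after it are printed.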

sed '/The following files have been resolved/,/^[ \t]*$/d' file 

Answer 3 (score: 0)

perl -n0E 'say $1 while /NOT been resolved:\n(.*?\n)\n/gs' file