How to filter columns in awk?

Time: 2013-08-11 21:45:40

Tags: unix compiler-construction awk

I would like to know how to filter the following lines in AWK:

DSL - 

  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analog
computer functions.  "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 for
the IBM 7090.  Sammet 1969, p.632.

FLIP - 

  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).

  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.

  3. Formal LIst Processor.  Early language for pattern-matching on LISP
structures.  Similar to CONVERT.  "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.

so that I get something like this:

DSL

FLIP

I am using the following statements in AWK:

BEGIN { RS = "\n\n\n" ;  FS = " - " } 

{ print $1 }

But all I get is this:

DSL

Thanks in advance!

5 answers:

Answer 0: (score: 1)

Assuming the format is constant (no spaces in the first entry):

awk '{ if ($2 == "-") print $1 }' file

Edit: but if you have an entry such as:

Objective C -
...

you need something like this:

awk '{ if ($NF == "-") { $NF = ""; print } }' file

awk is very good at parsing flat files with a predictable format.
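One detail of the second variant: assigning an empty string to $NF makes awk rebuild the record, so the printed line keeps a trailing space where the hyphen used to be. A minimal sketch that strips the trailing " -" with sub() instead is shown below. The sample input and the variable names sample and headers are made up for illustration.

    # Hypothetical sample input: header lines end in " -", entries are indented.
    sample='DSL -

      1. Digital Simulation Language.

    Objective C -

      1. An object-oriented extension of C.'

    # Select lines whose last field is "-", then remove the hyphen and
    # the whitespace around it from the end of the line.
    headers=$(printf '%s\n' "$sample" |
      awk '$NF == "-" { sub(/[ \t]*-[ \t]*$/, ""); print }')

    printf '%s\n' "$headers"

With this input the command prints DSL and Objective C, one per line, with no trailing space.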

Answer 1: (score: 1)

It looks like you are looking for lines that contain exactly two words, the second of which is -. If so, you can write:

awk 'NF == 2 && $2 == "-" { print $1 }'

You can qualify it further to insist that $1 starts at the beginning of the line (no leading whitespace):

awk '$0 !~ /^ / && NF == 2 && $2 == "-" { print $1 }'

On the given data, both commands produce just the lines DSL and FLIP.
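The two commands only differ on input that has an indented two-word line ending in a hyphen. The sketch below uses a made-up input (the "note -" line is hypothetical and does not occur in the question's data) to show that difference. The variable names are illustrative.

    # Hypothetical input with one indented line that also looks like "word -".
    sample='DSL -

      1. Digital Simulation Language.
      note -

    FLIP -'

    # Loose form: any line with exactly two fields, the second being "-".
    loose=$(printf '%s\n' "$sample" |
      awk 'NF == 2 && $2 == "-" { print $1 }')

    # Strict form: additionally reject lines that start with a space.
    strict=$(printf '%s\n' "$sample" |
      awk '$0 !~ /^ / && NF == 2 && $2 == "-" { print $1 }')

    printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"

Here the loose form reports DSL, note and FLIP, while the strict form reports only DSL and FLIP.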

Answer 2: (score: 1)

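@JonathanLeffler gave you a good awk answer, but if you are going to work with a lot of files in that format, you may want to consider reformatting them so that records are separated by a blank line and each list item sits on a single line, e.g.: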

$ cat file
DSL -

  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analog
computer functions.  "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 for
the IBM 7090.  Sammet 1969, p.632.

FLIP -

  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).

  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.

  3. Formal LIst Processor.  Early language for pattern-matching on LISP
structures.  Similar to CONVERT.  "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.

$ awk '!/^[[:space:]]*$/{printf "%s%s", (NF==2 && /-[[:space:]]*$/ ? rs rs : (/^ +[[:digit:]]+\./ ? rs : "")), $0; rs="\n"} END{print ""}' file
DSL -
  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analogcomputer functions.  "DSL/90 - A Digital Simulation Program for ContinuousSystem Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 forthe IBM 7090.  Sammet 1969, p.632.

FLIP -
  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).
  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
  3. Formal LIst Processor.  Early language for pattern-matching on LISPstructures.  Similar to CONVERT.  "FLIP, A Format List Processor", W.Teitelman, Memo MAC-M-263, MIT 1966.

That way you can easily process the output to print it or do whatever else you need, e.g.:

1) Print each header line plus the first bullet item:

$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} {print $1,$2}'
DSL -
  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analogcomputer functions.  "DSL/90 - A Digital Simulation Program for ContinuousSystem Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 forthe IBM 7090.  Sammet 1969, p.632.

FLIP -
  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).

2) Print the header line plus the second bullet item of the "FLIP" record:

$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} /^FLIP -/{print $1,$3}'
FLIP -
  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.

3) Print the header line plus the number of bullet items under that header:

$ awk '...' file | awk 'BEGIN{RS=""; FS=OFS="\n"} {print $1 NF-1}'
DSL - 1
FLIP - 3

and so on.
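For readers who find the one-liner dense, here is a spelled-out sketch of the same reflow idea followed by the counting step from item 3. It is not identical to the command above: wrapped continuation lines are joined with a space, and the bracket expressions use [ \t] and [0-9] rather than POSIX character classes, on the assumption that some older awk builds lack the latter. The function name reflow, the shortened sample text, and the variable names are made up for illustration.

    # Hypothetical, shortened sample in the question's layout.
    sample='DSL -

      1. Digital Simulation Language.  Extensions to FORTRAN to simulate analog
    computer functions.

    FLIP -

      1. Early assembly language on G-15.

      2. Second item.'

    # Reflow stdin: one blank line between records, one line per list item.
    reflow() {
      awk '
        /^[ \t]*$/ { next }                    # drop blank lines
        {
          if (NF == 2 && $0 ~ /-[ \t]*$/)      # header such as "DSL -"
            prefix = nl nl
          else if ($0 ~ /^ +[0-9]+\./)         # start of a numbered item
            prefix = nl
          else                                 # wrapped continuation line
            prefix = " "
          printf "%s%s", prefix, $0
          nl = "\n"                            # empty until the first line is printed
        }
        END { print "" }
      '
    }

    flat=$(printf '%s\n' "$sample" | reflow)

    # Paragraph mode: each blank-line-separated block is a record,
    # each line in it a field, so NF - 1 is the number of items.
    counts=$(printf '%s\n' "$flat" |
      awk 'BEGIN { RS = ""; FS = "\n" } { print $1 " " (NF - 1) }')

    printf '%s\n\n%s\n' "$flat" "$counts"

The parentheses around NF - 1 are optional in awk, since subtraction binds tighter than concatenation, but they make the intent explicit.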

Answer 3: (score: 0)

If all of the lines you want to skip start with a space, then this will work:

awk -F"-" '{if (substr($1,1,1)!=" ")print $1}'
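Two caveats are worth checking before relying on this. Empty lines also pass the substr() test, and in the question's data the wrapped continuation lines do not start with a space. The sketch below keeps the stated assumption (every non-header line is indented) but additionally skips empty lines and trims the whitespace before the hyphen. The sample input and variable names are hypothetical, and a header name that itself contains a hyphen would still be cut at its first hyphen.

    # Hypothetical sample in which every non-header line is indented.
    sample='DSL -

      1. Digital Simulation Language.

    FLIP -

      1. Early assembly language on G-15.'

    # The field separator is optional whitespace followed by a hyphen, so $1
    # has no trailing space; the pattern requires a non-blank first character,
    # which skips both empty lines and indented lines.
    names=$(printf '%s\n' "$sample" |
      awk -F'[ \t]*-' '/^[^ \t]/ { print $1 }')

    printf '%s\n' "$names"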

Answer 4: (score: 0)

A grep one-liner can do it for you:

grep -Po '.*(?= -\s*$)' file
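The -P option (Perl-compatible regular expressions) belongs to GNU grep and may be missing on other systems. Where it is unavailable, a sed substitution is one possible stand-in. This is a sketch rather than an exact equivalent in every edge case, and the sample input and variable names are made up for illustration.

    # Hypothetical sample in the question's layout.
    sample='DSL -

      1. Digital Simulation Language.

    FLIP -

      1. Early assembly language on G-15.'

    # -n suppresses default output; the p flag prints only lines on which
    # the trailing " -" (plus optional whitespace) was actually removed.
    names=$(printf '%s\n' "$sample" |
      sed -n 's/ -[[:space:]]*$//p')

    printf '%s\n' "$names"

Like the grep version, this matches any line ending in " -", including an indented one.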