Question

I'm new to AWK programming, and I'd like to know how to filter the following text:

Goedel - Declarative language for AI, based on many-sorted logic.  Strongly
typed, polymorphic, declarative, with a module system.  Supports bignums
and sets.  "The Goedel Programming Language", P. M. Hill et al, MIT Press
1994, ISBN 0-262-08229-2.  Goedel 1.4 - partial implementation in SICStus
Prolog 2.1.
ftp://ftp.cs.bris.ac.uk/goedel
info: goedel@compsci.bristol.ac.uk

so that it prints just:

Goedel

I used the following command, but it doesn't work properly:

awk -F " - " "/ - /{ print $1 }"

It prints the following:

Goedel
1994, ISBN 0-262-08229-2.  Goedel 1.4

Can anyone tell me what I need to change to get the output I want?

Thanks in advance.

Answer 1

awk 'BEGIN { RS = "" } { print $1 }' your_file.txt

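Explanation: setting RS to the empty string makes awk split the input into paragraphs at blank lines. Each paragraph is then split into words on the default separator (whitespace), and the first word of each paragraph ($1) is printed.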
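As a quick check of paragraph mode, here is a minimal sketch that runs the same program on a shortened copy of the entry from the question, followed by a second entry. The second entry and the function name first_words are invented for illustration and are not part of the original data.

```shell
# first_words is a hypothetical helper name, used only for this illustration.
# It prints the first whitespace-separated word of every paragraph on stdin.
first_words() {
    awk 'BEGIN { RS = "" } { print $1 }'
}

# A shortened copy of the entry from the question, then an invented entry.
first_words <<'EOF'
Goedel - Declarative language for AI, based on many-sorted logic.  Strongly
typed, polymorphic, declarative, with a module system.
ftp://ftp.cs.bris.ac.uk/goedel
info: goedel@compsci.bristol.ac.uk

Example - An invented entry, present only to show a second paragraph.
info: made up for this sketch
EOF
```

This should print Goedel and Example, one per line. Note that $1 here is the first word, so a name containing spaces would be cut at the first space.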

Answer 2

This one-liner does what you want:

awk -F ' - ' 'NF>1{print $1;exit}'
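The command in the question printed a second line because / - / also matches the line containing "Goedel 1.4 - partial implementation". The sketch below shows how NF>1 combined with exit stops after the first match. The function name first_name is hypothetical, and the input is a three-line excerpt of the entry.

```shell
# first_name is a hypothetical helper name, used only for this illustration.
first_name() {
    # NF>1 is true only on lines that contain the " - " separator;
    # exit stops awk after the first such line has been printed.
    awk -F ' - ' 'NF>1 { print $1; exit }'
}

# Two of these three lines contain " - ", but only the first is reported.
first_name <<'EOF'
Goedel - Declarative language for AI, based on many-sorted logic.  Strongly
1994, ISBN 0-262-08229-2.  Goedel 1.4 - partial implementation in SICStus
Prolog 2.1.
EOF
```

Because of the exit, only the first entry in a file is reported, so this form suits input that holds a single entry.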

Answer 3

awk -F ' - ' ' { if (FNR % 4 != 1) next; print $1; }'

If the format is exactly like the one below, the code above should work:

1 Author - ...
2 Year ...
3 URL
4 Extra info ...
5 Author - ...
6..N etc.

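If there are blank lines between entries, you can set RS to the empty string; as long as the value of -F (the FS variable inside an awk script) stays the same, $1 will still be the author. The advantage is that you can still tell entries apart when an "info: ..." line or the URL is missing, provided the input does not look like "Author - ...{newline}Year ...{newline}{newline}info: ...{newline}{newline}Author - ..." (if a blank line is what separates entries, there cannot be blank lines between the parts of a single entry). For example: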

# A blank line is what separates each entry.
BEGIN { RS = ""; }

{ print $1; }
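A minimal sketch of combining paragraph mode with the " - " field separator, so that $1 is everything before the first " - " rather than only the first word. It assumes the first line of every entry contains " - ". The function name authors and the second entry are invented for illustration.

```shell
# authors is a hypothetical helper name, used only for this illustration.
# RS = "" selects paragraph mode; FS = " - " is the same separator as -F ' - '.
authors() {
    awk 'BEGIN { RS = ""; FS = " - " } { print $1 }'
}

# The second entry is invented and deliberately has no URL or info line.
authors <<'EOF'
Goedel - Declarative language for AI, based on many-sorted logic.
ftp://ftp.cs.bris.ac.uk/goedel

Some Name - An invented entry with no URL or info line.
EOF
```

This should print Goedel and Some Name, one per line; the multi-word name survives because fields are split on " - " instead of on whitespace.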

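If you have an awk that supports it, you can make RS a multi-character string when necessary (e.g. RS = "\n--\n" for entries separated by a line containing only "--"). If you need a regular expression, or your awk simply does not support multi-character record separators, you will be forced to use something like the following: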

BEGIN { found_sep = 1; }

{ if (found_sep) { print $1; found_sep = 0; } }

# Entry separator is "--\n"
/^--$/ { found_sep = 1; }
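To try the flag-based approach, the sketch below wraps the same three rules in a shell function and feeds it two entries separated by a "--" line. The function name first_after_sep and the second entry are invented for illustration.

```shell
# first_after_sep is a hypothetical helper name, used only for this illustration.
first_after_sep() {
    awk '
        # Treat the start of the input as if a separator had just been seen.
        BEGIN { found_sep = 1 }

        # Print the first word of the first line that follows a separator.
        { if (found_sep) { print $1; found_sep = 0 } }

        # A line holding only "--" marks the end of an entry.
        /^--$/ { found_sep = 1 }
    '
}

first_after_sep <<'EOF'
Goedel - Declarative language for AI, based on many-sorted logic.
ftp://ftp.cs.bris.ac.uk/goedel
--
Example - An invented entry, present only to show a second record.
info: made up for this sketch
EOF
```

This should print Goedel and Example, one per line. One caveat of this sketch: two "--" lines in a row would cause the second "--" itself to be printed, so it assumes separators never appear back to back.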

Anything more complicated would require more sample input.
