Question

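I'd like to know how to filter the names of programming languages out of a txt file. I tried the following condition in AWK, but I can't get what I want: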

($1 ~ /[A-Za-z]*/)  && ( ($3 ~ /-/) || ($4 ~ /-/) )

Any ideas on how to do this? As you can see, the lines don't follow a regular format.

In other words, I have the following lines, but I only want to print the programming language names:

2.PAK - AI language with coroutines.  "The 2.PAK Language: Goals and
Description", L.F. Melli, Proc IJCAI 1975.

473L Query - English-like query language for Air Force 473L system.  Sammet
1969, p.665.  "Headquarters USAF Command and Control System Query
Language", Info Sys Sci, Proc 2nd Congress, Spartan Books 1965, pp.57-76.

3-LISP - Brian Smith.  A procedurally reflective dialect of LISP which uses
an infinite tower of interpreters.

I only want to filter and display the following:

2.PAK

473L Query 

3-LISP

Edit: Would the same command also work for the following?

DML - 

  1. Data Management Language.  Early ALGOL-like language with lists,
graphics, on Honeywell 635.  

  2. "DML: A Meta-language and System for the Generation of Practical and
Efficient Compilers from Denotational Specifications"

I suppose I only need to adjust RS and FS so that I get just this line?

DML

Thanks in advance!

Answer 1

It looks like " - " would be a good separator, given the file:

$ cat /tmp/a 
2.PAK - AI language with coroutines.  "The 2.PAK Language: Goals and
Description", L.F. Melli, Proc IJCAI 1975.

473L Query - English-like query language for Air Force 473L system.  Sammet
1969, p.665.  "Headquarters USAF Command and Control System Query
Language", Info Sys Sci, Proc 2nd Congress, Spartan Books 1965, pp.57-76.

3-LISP - Brian Smith.  A procedurally reflective dialect of LISP which uses
an infinite tower of interpreters.

You can use the following:

$ awk -F ' - ' '/ - /{ print $1 }' /tmp/a
2.PAK
473L Query
3-LISP
$
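For the DML entry added in the edit, a slightly stricter variant may help. The following is only a sketch under assumptions that are not stated in the question: entry headers start in the first column, they come right after a blank line (or at the top of the file), and the separator may appear as " -" at the end of a line if the trailing space has been lost. The function name `extract_names` is made up for illustration.

```shell
# extract_names is a hypothetical helper name, not part of the original answer.
# Assumptions: a header line starts in column 1, follows a blank line (or is
# the first line), and has " - " (or a trailing " -") after the name.
extract_names() {
    awk '
        # header candidate: first line, or previous line was blank,
        # and the line itself does not start with whitespace
        (NR == 1 || prev ~ /^[ \t]*$/) && /^[^ \t]/ {
            # find " -" followed by a space or the end of the line
            if (match($0, / -( |$)/)) print substr($0, 1, RSTART - 1)
        }
        { prev = $0 }   # remember this line for the next iteration
    ' "$@"
}

# Sample input taken from the question
extract_names <<'EOF'
2.PAK - AI language with coroutines.  "The 2.PAK Language: Goals and
Description", L.F. Melli, Proc IJCAI 1975.

3-LISP - Brian Smith.  A procedurally reflective dialect of LISP which uses
an infinite tower of interpreters.

DML -

  1. Data Management Language.  Early ALGOL-like language with lists,
graphics, on Honeywell 635.
EOF
```

On this sample it should print 2.PAK, 3-LISP and DML, one per line. As a side note, the `$1 ~ /[A-Za-z]*/` test in the question is true for every line, because `*` also matches the empty string, so it does not filter anything.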

Answer 2

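If I understand correctly that your file contains multi-line "sections" separated by blank lines, and that each "section" begins with the language name followed by " - ", then you can write: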

awk 'BEGIN { RS = "\n\n"; FS = " - " } { print $1 }'

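The BEGIN block (which runs before the first record is read) sets the record separator RS to "\n\n" (two newlines, i.e. a blank line), so that each of your sections is a single AWK record, and sets the field separator FS to " - ", so that the language name is the first "field" of the section. The block { print $1 } prints the first field of each record.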
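With the DML example from the edit, the numbered sub-items are separated by blank lines too, so they become records of their own and would be printed in full. A possible way around that, sketched here with some assumptions: it uses `RS = ""` (POSIX paragraph mode) instead of the multi-character `"\n\n"`, which is an extension in some awk implementations, and it prints a name only when the separator sits on the first line of the record. The function name `first_fields` is hypothetical.

```shell
# first_fields is a hypothetical helper name used only for this sketch.
first_fields() {
    awk '
        BEGIN { RS = "" }   # paragraph mode: runs of blank lines separate records
        {
            nl = index($0, "\n")   # end of the first line, 0 for a one-line record
            # look for " -" followed by a space, a newline, or the end of the record
            if (match($0, / -( |\n|$)/) && (nl == 0 || RSTART < nl))
                print substr($0, 1, RSTART - 1)
        }
    ' "$@"
}

# Abbreviated sample based on the question
printf '%s\n' \
    '2.PAK - AI language with coroutines.' \
    'Description", L.F. Melli, Proc IJCAI 1975.' \
    '' \
    'DML -' \
    '' \
    '  1. Data Management Language.' | first_fields
```

Here the record holding the numbered item contains no separator, so only 2.PAK and DML should be printed. It avoids `$1` on purpose, since how newlines interact with a multi-character FS in paragraph mode can differ between awk implementations.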
