Question

Please help me with this regular expression. For each phrase, I need all the components of the first Meta Mapping.

Phrase:. \nMeta Mapping *.* What would this be? I only started learning regular expressions today.

This is what I have so far, and I'm a bit stuck. Below are my document and the output I want.

Main file:

Phrase: "is"

Phrase: "normal."
Meta Mapping (1000):
 1000   % Normal (Mean Percent of Normal) [Quantitative Concept]
Meta Mapping (1000):
 1000   Normal [Qualitative Concept]
Meta Mapping (1000):
 1000   % normal (Percent normal) [Quantitative Concept]
Processing 00000000.tx.8: The EKG shows nonspecific changes.

Phrase: "The EKG"
Meta Mapping (1000):
 1000   EKG (Electrocardiogram) [Finding]
Meta Mapping (1000):
 1000   EKG (Electrocardiography) [Diagnostic Procedure]

Phrase: "shows"
Meta Mapping (1000):
 1000   Show [Intellectual Product]

Phrase: "nonspecific changes."
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changing) [Functional Concept]
Meta Mapping (901):
 694   Non-specific (Unspecified) [Qualitative Concept]
 861   changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
 694   Non-specific (Unspecified) [Qualitative Concept]
 861   changes (Changing) [Functional Concept]

I want the result to contain only one Meta Mapping per phrase.

So:

Phrase: "normal."
Meta Mapping (1000):
 1000   % Normal (Mean Percent of Normal) [Quantitative Concept]

Phrase: "The EKG"
Meta Mapping (1000):
 1000   EKG (Electrocardiogram) [Finding]

Phrase: "shows"
Meta Mapping (1000):
 1000   Show [Intellectual Product]

Phrase: "nonspecific changes."
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changed status) [Quantitative Concept]

Please help me with this regex. For each phrase, I need all the components of the first Meta Mapping. Thanks!

Answer 1

I think this might work for you. It is just a regex, nothing to do with awk. Tested at regex101.com/

^Phrase.*\nMeta.*\n(?:(?!Meta|\n).*\n)*
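
As a rough illustration of how a pattern of this shape behaves, here is a minimal Python sketch. It is an assumption-laden demo, not part of the tested answer: the lookahead is written as a plain alternation, (?!Meta|\n), the pattern is anchored with ^ under re.MULTILINE, and the SAMPLE text and the first_mappings helper are hypothetical names made up for the example.

```python
import re

# Hypothetical sample, shortened from the input shown in the question.
SAMPLE = '''Phrase: "is"

Phrase: "The EKG"
Meta Mapping (1000):
 1000   EKG (Electrocardiogram) [Finding]
Meta Mapping (1000):
 1000   EKG (Electrocardiography) [Diagnostic Procedure]

Phrase: "nonspecific changes."
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changing) [Functional Concept]
'''

# A "Phrase" line, the "Meta Mapping" line right after it, then every
# following line that neither starts with "Meta" nor is empty.
PATTERN = re.compile(r'^Phrase.*\nMeta.*\n(?:(?!Meta|\n).*\n)*', re.MULTILINE)


def first_mappings(text):
    """Return one matched block per phrase that has at least one mapping."""
    return [m.group(0) for m in PATTERN.finditer(text)]


blocks = first_mappings(SAMPLE)
print("\n".join(blocks))
```

A phrase with no Meta Mapping line directly below it (such as "is") produces no match, so it is dropped from the result.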

GNU awk version:

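awk '
BEGIN {
    FS = "\n"      # each line of a record is one field
    RS = "\n\n"    # a blank line ends a record (multi-character RS is a gawk extension)
    OFS = "\n"
}
NF > 1 {
    print $1, $2   # the "Phrase:" line and the first "Meta Mapping" line
    for (i = 3; i <= NF; i++) {
        if ($i ~ /^Meta Mapping/)
            break  # second "Meta Mapping" reached: stop copying lines
        print $i
    }
    print ""       # blank line between output blocks
}
' your_data_file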

Answer 2

A commented, POSIX-compliant awk solution:

awk -v RS='' -F'\n' -v re='^Meta Mapping \\(' '
    # Only process non-empty records:
    # those that have at least 1 "Meta Mapping" line.
  $2 ~ re { 
    print $1 # print the "Phrase: " line
    print $2 # print the 1st "Meta Mapping" line.
      # Print the remaining lines, if any, up to but not including
      # the next "Meta Mapping" line.
    for (i=3;i<=NF;++i) {
      if ($i ~ re) break # next "Meta Mapping" found; ignore and terminate block.
      print $i
    }
    print "" # print empty line between output blocks
  }
' file

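RS='' is an awk idiom that splits the input into records at blank lines. In other words, each run of non-empty lines forms one record.
-F'\n' splits each record into fields by line; that is, $1 refers to the first line of each record, $2 to the second, and so on. NF contains the number of lines (fields) in the current record.
re='...' defines an awk variable containing a regular expression that identifies the Meta Mapping lines in each record.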
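
For readers who find the record and field mechanics easier to follow outside awk, the same logic can be sketched in Python. This is only an approximation of awk's paragraph mode, offered as an illustration; the function name first_meta_mapping and the sample text are hypothetical and not part of the answer.

```python
import re

# Same role as the awk variable `re`: recognizes a "Meta Mapping (" line.
META = re.compile(r'^Meta Mapping \(')

# Hypothetical sample, shortened from the input shown in the question.
SAMPLE = '''Phrase: "is"

Phrase: "shows"
Meta Mapping (1000):
 1000   Show [Intellectual Product]

Phrase: "normal."
Meta Mapping (1000):
 1000   % Normal (Mean Percent of Normal) [Quantitative Concept]
Meta Mapping (1000):
 1000   Normal [Qualitative Concept]
Processing 00000000.tx.8: The EKG shows nonspecific changes.
'''


def first_meta_mapping(text):
    """Return, per record, the Phrase line plus its first Meta Mapping block."""
    out = []
    # Roughly RS='': a record is a run of lines delimited by empty lines.
    for record in re.split(r'\n{2,}', text.strip('\n')):
        lines = record.split('\n')       # roughly -F'\n': one field per line
        if len(lines) < 2 or not META.match(lines[1]):
            continue                     # line 2 is not a mapping: skip record
        kept = lines[:2]                 # the Phrase line and the first mapping line
        for line in lines[2:]:
            if META.match(line):
                break                    # next mapping found: stop copying
            kept.append(line)
        out.append('\n'.join(kept))
    return out


result = first_meta_mapping(SAMPLE)
print('\n\n'.join(result))
```

Because copying stops at the second Meta Mapping line, trailing lines such as the "Processing ..." line never reach the output.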

Selectively extracting lines from blocks of lines
