Question

Suppose I have the following file:

This file has two lines
This file has three lines
This file has four
This file has five lines

I want to extract file and lines, so that I get the following output:

file lines
file lines
file
file lines

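If both matches are found on a line, print them on the same output line. If only one is found, print it, leave a placeholder (empty, blank, whatever), and move on to the next line.

I tried this: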

grep -oP '(file)|(lines)' example.txt | paste -d ' ' - -

But I get:

file lines
file lines
file file
lines

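Because lines is not found on the third line, paste takes the file match from the next line and puts it on the same output line.

I am essentially forcing paste to fill every slot in the output, regardless of what was found on each line.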
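To see where the alignment gets lost, it helps to look at what grep -o hands to paste. The sketch below is only a diagnostic: the function name is made up for illustration, -E is swapped in for -P, and -n is added so that each match is prefixed with the number of the input line it came from (as GNU grep prints it).

```shell
# Hypothetical helper (not part of the original command): list every match
# together with the number of the input line it was found on.
matches_with_line_numbers() {
  grep -onE 'file|lines'
}

printf '%s\n' \
  'This file has two lines' \
  'This file has three lines' \
  'This file has four' \
  'This file has five lines' | matches_with_line_numbers
# With GNU grep this prints:
# 1:file
# 1:lines
# 2:file
# 2:lines
# 3:file
# 4:file
# 4:lines
```

Line 3 contributes only one entry. Without -n that information is gone, and paste simply pairs up whatever arrives next.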

How can I change this?

Answer 1

I am assuming that file and lines are really regular expressions with their own match groups. The following works with any ERE:

#!/usr/bin/env bash

# replace these with any ERE-compliant regex of your choice
file_re='(file)'    # for instance: file_re='file=([^[:space:]]+)([[:space:]]|$)'
lines_re='(lines)'

while IFS= read -r line; do
  # default to a blank placeholder if no matches exist
  file= lines=

  # compare against each regex; if one matches, assign the group contents to a variable
  [[ $line =~ $file_re ]] && file=${BASH_REMATCH[1]}
  [[ $line =~ $lines_re ]] && lines=${BASH_REMATCH[1]}

  # print a line of output if *either* regex matched.
  [[ $file || $lines ]] && printf '%s\t%s\n' "$file" "$lines"

done <"${1:-example.txt}" # with input from $1 if given, or example.txt otherwise

See BashFAQ #1 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?") for a description of the technique used here.

Given your input, the output (tab-separated) is:

file	lines
file	lines
file	
file	lines
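As a rough illustration of the "any ERE with its own match group" idea, here is a sketch of the same loop run against key=value input. Everything specific to it is an assumption made for the example: the extract_pairs function name, the two regexes, the sample input, and the choice of "-" as a visible placeholder instead of an empty field.

```shell
#!/usr/bin/env bash
# Hypothetical variant of the loop above; names, regexes and input are illustrative.
extract_pairs() {
  local file_re='file=([^[:space:]]+)' lines_re='lines=([0-9]+)'
  local line file lines
  while IFS= read -r line; do
    file= lines=                      # reset for every input line
    [[ $line =~ $file_re ]] && file=${BASH_REMATCH[1]}
    [[ $line =~ $lines_re ]] && lines=${BASH_REMATCH[1]}
    # print "-" where a group did not match; skip lines with no match at all
    [[ $file || $lines ]] && printf '%s\t%s\n' "${file:--}" "${lines:--}"
  done
  return 0
}

printf '%s\n' 'file=a.txt lines=10' 'file=b.txt' 'nothing here' 'lines=7' | extract_pairs
```

Because both variables are cleared at the top of each iteration, a value captured on one line can never leak into the next, which is exactly what the grep | paste pipeline could not guarantee.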

Answer 2

sed is for s/old/new/, grep is for g/re/p. For any other text manipulation, you should use awk.

With GNU awk, for the third argument to match():

$ awk '{f=match($0,/file/,a); f+=match($0,/lines/,b)} f{print a[0], b[0]}' file
file lines
file lines
file
file lines

With other awks, you can use substr() with RSTART and RLENGTH to capture the matched string.
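For awks that lack the three-argument match(), a minimal sketch of the substr() approach might look like the following. The two-argument match() sets RSTART and RLENGTH, which substr() then uses to pull out the matched text. The extract_portable wrapper name and the inline sample input are assumptions made for the example.

```shell
# Hypothetical wrapper; reads stdin or the files given as arguments.
extract_portable() {
  awk '{
    file = lines = ""                 # reset both captures for every record
    if (match($0, /file/))  file  = substr($0, RSTART, RLENGTH)
    if (match($0, /lines/)) lines = substr($0, RSTART, RLENGTH)
    # print only when at least one regex matched; a missing value stays empty
    if (file != "" || lines != "") print file, lines
  }' "$@"
}

printf '%s\n' \
  'This file has two lines' \
  'This file has three lines' \
  'This file has four' \
  'This file has five lines' | extract_portable
```

As with the GNU awk version, a line where only file matches is printed as file followed by the output field separator and an empty second field.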

How do I put regex match groups into separate output columns, correctly handling missing/empty values?
