Question

I have a file, File1.txt, that contains data in the following format:

*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6         
 46573       1   48206   48210   48217   48205       0       0           
 46574       1   48205   48217   48218   48204       0       0
............................................................
.............................................................

I want to take each number in the third column, such as 48206, and look it up in another file, File2.txt, whose format looks like this:

text
text
48205       54.995392  -1.847287e-009      149.449997       0       0
48206       55.995308  -1.879442e-009      149.449997       0       0
48207       56.995224  -1.911598e-009      149.449997       0       0
text
48208       56.995224  -1.911598e-009      149.449997       0       0
...

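Then I want to copy the complete matching line, number included, back into the first file, appended at the end. So File1.txt should end up looking like this: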

*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6      
46573       1   48206   48210   48217   48205       0       0       
46574       1   48205   48217   48218   48204       0       0       
....................................................
............................................................
48206       55.995308  -1.879442e-009      149.449997       0       0

Any suggestions using sed or awk?

Answer 1

Using awk:

awk 'NR == FNR { print; if(NR > 2) { seen[$3] = 1 }; next } seen[$1]' file1 file2

The code works as follows:

NR == FNR {       # while processing the first file
  print           # print the line (echoing file1 fully)
  if(NR > 2) {    # from the third line onward (skips the two header lines)
    seen[$3] = 1  # remember the third fields you saw
  }
  next            # don't do anything else.
}
seen[$1]          # while processing the second file: select lines
                  # whose first field is one of the remembered fields.

You can then redirect this output to another file and replace file1 with it:

awk 'NR == FNR { print; if(NR > 2) { seen[$3] = 1 }; next } seen[$1]' file1 file2 > file1.new && mv file1.new file1
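To see the one-liner end to end, here is a minimal sketch that runs it on a trimmed copy of the sample data from the question. The temporary directory, the workdir variable and the shortened file contents are assumptions made for illustration only.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Trimmed copy of File1.txt from the question: two header lines, then one data line.
cat > file1 <<'EOF'
*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6
 46573       1   48206   48210   48217   48205       0       0
EOF

# Trimmed copy of File2.txt: node lines mixed with unrelated text lines.
cat > file2 <<'EOF'
text
48205       54.995392  -1.847287e-009      149.449997       0       0
48206       55.995308  -1.879442e-009      149.449997       0       0
text
EOF

# Echo file1, remember its third column, then print the file2 lines
# whose first field was remembered; replace file1 only if awk succeeded.
awk 'NR == FNR { print; if (NR > 2) { seen[$3] = 1 }; next } seen[$1]' file1 file2 > file1.new &&
  mv file1.new file1

cat file1
```

After the run, file1 holds its original three lines followed by the 48206 line from file2. The 48205 line is not appended, because 48205 appears only in the sixth column of file1, not the third.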

Answer 2

You can use awk to extract the field from file1, use grep to find the matching lines in file2, and then simply concatenate the files.

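while read -r LINE
do
    # Quote "$LINE" so that "*ELEMENT_SHELL" is not glob-expanded by the shell.
    SEARCHSTR=$(echo "$LINE" | awk '{print $3}')
    # Skip headers and filler lines: an empty pattern would match every line.
    case "$SEARCHSTR" in ''|*[!0-9]*) continue ;; esac
    # Match the number only as the first field of a line in file2.txt.
    grep -E "^[[:space:]]*$SEARCHSTR[[:space:]]" file2.txt >>append.txt
done < file1.txt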

cat file1.txt append.txt >file1_append.txt

To have grep search for the contents of several fields, you have to build a regular expression that contains all of them, i.e.

SEARCHSTR=$(echo "$LINE" | awk 'BEGIN {OFS="|"} {print $3, $4, $5}')
grep -E "($SEARCHSTR)" file2.txt >>append.txt

Here $SEARCHSTR holds the contents of the fields, separated by |.
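As a rough sketch of the multi-field variant, the snippet below builds the alternation for one element line and runs it against three node lines, all copied from the question. Anchoring the pattern to the start of the line is an added assumption, so that a node number is not matched inside a coordinate value. The variable names nodes and matches are made up for this illustration.

```shell
#!/bin/sh
# One element line from the question (fields 3-5 are n1, n2, n3).
LINE=' 46573       1   48206   48210   48217   48205       0       0'

# Join fields 3, 4 and 5 with "|" to form a regex alternation.
SEARCHSTR=$(echo "$LINE" | awk 'BEGIN { OFS = "|" } { print $3, $4, $5 }')
echo "$SEARCHSTR"

# Three node lines from the question's File2.txt sample.
nodes='48205       54.995392  -1.847287e-009      149.449997       0       0
48206       55.995308  -1.879442e-009      149.449997       0       0
48207       56.995224  -1.911598e-009      149.449997       0       0'

# -E enables "|" alternation; the anchors (an assumption added here)
# restrict the match to the first field of each line.
matches=$(printf '%s\n' "$nodes" | grep -E "^[[:space:]]*($SEARCHSTR)[[:space:]]")
printf '%s\n' "$matches"
```

With this sample only the 48206 line is selected, because 48210 and 48217 do not occur as a first field in the three node lines.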

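Regarding speed: if the columns of your file are at fixed positions, you can use cut instead of awk, like this (the character ranges are illustrative and must match your file's actual column positions):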

SEARCHSTR=$(echo "$LINE" | cut --output-delimiter="|" -c 15-20,21-26,27-32|tr -d " ")
