Question

I am trying to match data between two files and use the result to create a new file.

File 1 contains data like this:

19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf
19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf

File 2 contains only the first 7 characters, like this:

19V17R1
1BC6062

The final file should look like this:

19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf

I can match the files by first creating a file that contains only the first 7 characters and then running:

awk 'FNR==NR{!a[$1]++;next}$0 in a' /RMAs.txt /sortedWipelogs.txt > matches.text

I can't figure out how to output the whole file name in the second column. Thanks.

Answer 1

If both files are sorted as shown, it's simply

$ join -t- file1 file2

19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf

To get the desired output format, it may be easier to post-process this output than to set join's -o option.
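One way to do that post-processing is to pipe the join output through awk, which prepends the first "-"-delimited field to each line. This is only a minimal sketch: it assumes both files are already sorted on the key, as in the question, and the scratch directory and the variable name join_result are illustrative rather than part of the original answer.

```shell
# Sample data copied from the question, written to a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' \
  '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' \
  '19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf' \
  '19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf' \
  '1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf' > "$tmp/file1"
printf '%s\n' '19V17R1' '1BC6062' > "$tmp/file2"

# join pairs lines on the first "-"-delimited field;
# awk then prints that field followed by the whole joined line.
join_result=$(join -t- "$tmp/file1" "$tmp/file2" | awk -F- '{ print $1, $0 }')
printf '%s\n' "$join_result"

rm -rf "$tmp"
```

If the inputs are not sorted, join will miss matches or complain, so sorting both files first would be needed.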

Answer 2

Could you please try the following.

awk 'FNR==NR{a[$0]=$0;next} a[$1]{print a[$1],$0}' Input_file2  FS="-" Input_file1

Explanation: adding an explanation of the above code.

awk '
FNR==NR{                  ##Checking condition FNR==NR which will be true when first Input_file named file2 is being read.
  a[$0]=$0                ##Creating an array named a whose index is $0 and value is $0.
  next                    ##Using next will skip all further statements from here.
}                         ##Closing block for FNR==NR here.
a[$1]{                    ##Checking condition if a[$1] is NOT NULL then do following.
  print a[$1],$0          ##Printing value of array a whose index is $1 of current line, along with the current line.
}' file2  FS="-" file1    ##Closing block and mentioning Input_file file2 name then setting FS="-" and mentioning Input_file name file1 here.
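The part of this command that is easiest to miss is the FS="-" placed between the two file names: awk applies such an assignment only when it reaches it in the argument list, so the first file is still split on whitespace. A small sketch that makes this visible, using one line from each sample file (the scratch directory and the name fs_result are illustrative):

```shell
tmp=$(mktemp -d)
printf '%s\n' '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' > "$tmp/file1"
printf '%s\n' '19V17R1' > "$tmp/file2"

# For every record, print the file name, the field count, and the first field.
# file2 is read with the default separator; file1 is read after FS becomes "-".
fs_result=$(cd "$tmp" && awk '{ print FILENAME, NF, $1 }' file2 FS="-" file1)
printf '%s\n' "$fs_result"

rm -rf "$tmp"
```

Both files yield the same first field, which is why $1 from file1 can be looked up in an array built from file2.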

Answer 3

It's as simple as creating the following go.awk:

NR==FNR { lookup[substr($0,1,7)] = $0 }
NR!=FNR { print $0" "lookup[$0] }

Then run:

awk -f go.awk file1.txt file2.txt

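The first command is executed for each line in the first input file; it simply stores the whole line in an associative array, keyed by the first seven characters, for later lookup.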

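For each line in the second and subsequent input files, the second command outputs the line and the associated entry in the associative array. The output is exactly what you asked for: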

19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf

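Now, I prefer using a script, since it means I don't have to search my history for arbitrarily complex awk commands, but if you want a one-liner that does the same thing: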

awk 'NR==FNR{lookup[substr($0,1,7)]=$0}NR!=FNR{print $0" "lookup[$0]}' file1.txt file2.txt
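Note that the second rule prints every line of the second file, so a key with no counterpart in the first file would come out followed by a space and nothing else. The variant below is a sketch of one possible guard, not the answer's original script: it prints only keys that exist in the array. NOMATCH is a made-up key added to show the difference, and lookup_result is an illustrative name.

```shell
tmp=$(mktemp -d)
printf '%s\n' \
  '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' \
  '19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf' \
  '1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf' > "$tmp/file1.txt"
# NOMATCH is a hypothetical key that does not occur in file1.txt.
printf '%s\n' '19V17R1' 'NOMATCH' '1BC6062' > "$tmp/file2.txt"

lookup_result=$(awk '
  # First file: store each line under its first seven characters.
  NR == FNR { lookup[substr($0, 1, 7)] = $0; next }
  # Later files: print the key and its stored line only if the key is known.
  $0 in lookup { print $0 " " lookup[$0] }
' "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$lookup_result"

rm -rf "$tmp"
```

As in the original script, when a key occurs several times in the first file, only the last matching line is kept.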

Answer 4

Using Perl

perl -lne ' BEGIN { $x=join("|", map{chomp;$_} qx(cat mweb2.txt)) } s/^($x)/$1 $1/g and print '

With the inputs

$ cat mweb1.txt
19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf
19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf

$ cat mweb2.txt
19V17R1
1BC6062

$ perl -lne ' BEGIN { $x=join("|", map{chomp;$_} qx(cat mweb2.txt)) } s/^($x)/$1 $1/g and print ' mweb1.txt
19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf


Answer 5

There are many ways to do this. There is already an answer using join. Here is one using grep:

$ grep -F -f file2 file1
19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf

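Be aware that this can also match the key anywhere else in a line, not just at the start, but if you are sure of the format it will do. You also don't really need the first column, since it only repeats the part that matched! If you do want the first column, it can be added simply like this: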

$ grep -F -f file2 file1 | awk '{print substr($0,1,7), $0 }'
19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf

Or just

$ awk '(NR==FNR){a[$1];next}(substr($0,1,7) in a){ print substr($0,1,7), $0 }' file2 file1

Or shorter, using - as the field separator (for file1 only, to avoid problems with whitespace in file2):

$ awk '(NR==FNR){a[$1];next}($1 in a){ print $1, $0 }' file2 FS="-" file1
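If you want to stay with grep but rule out matches in the middle of a line, one option is to turn each key into an anchored regular expression. The following is a sketch under the assumption that the keys contain only letters and digits, so they need no regex escaping; the ZZZZZZZ line is a made-up example, and the patterns file and variable names are illustrative.

```shell
tmp=$(mktemp -d)
# The last line is a hypothetical entry where a key appears mid-line.
printf '%s\n' \
  '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' \
  '19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf' \
  'ZZZZZZZ-wipedrive-1BC6062-d0.pdf' > "$tmp/file1"
printf '%s\n' '19V17R1' '1BC6062' > "$tmp/file2"

# Unanchored fixed-string search: also returns the ZZZZZZZ line.
loose_result=$(grep -F -f "$tmp/file2" "$tmp/file1")

# Rewrite each key K as the regex ^K- so it must start the line.
sed 's/.*/^&-/' "$tmp/file2" > "$tmp/patterns"
anchored_result=$(grep -f "$tmp/patterns" "$tmp/file1")
printf '%s\n' "$anchored_result"

rm -rf "$tmp"
```

The awk versions above avoid this issue by construction, since they compare only the first seven characters or the first field.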
