Grep lines from a reference file while keeping the source column?

Time: 2013-09-16 22:24:53

Tags: file grep

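I have two tables. Table 1 has multiple columns and Table 2 has a single column. My question is: how can I extract the rows of Table 1 that contain a value from Table 2? I think a simple grep should work, but how can I run a grep for each line? I would like the output to keep the matching Table 2 identifier.

Thanks!

Desired output: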

IPI00004233 IPI00514755;IPI00004233;IPI00106646;    Q9BRK5-1;Q9BRK5-2;
IPI00001849 IPI00420049;IPI00001849;    Q5SV97-1;Q5SV97-2;
...
......

Table 1:

IPI00436567;    Q6VEP3;
IPI00169105;IPI01010102;    Q8NH21;
IPI00465263;    Q6IEY1;
IPI00465263;    Q6IEY1;
IPI00478224;    A6NHI5;
IPI00853584;IPI00000733;IPI00166122;    Q96NU1-1;Q96NU1-2;
IPI00411886;IPI00921079;IPI00385785;    Q9Y3T9;
IPI01010975;IPI00418437;IPI01013997;IPI00329191;    Q6TDP4;
IPI00644132;IPI00844469;IPI00030240;    Q494U1-1;Q494U1-2;
IPI00420049;IPI00001849;    Q5SV97-1;Q5SV97-2;
IPI00966381;IPI00917954;IPI00028151;    Q9HCC6;
IPI00375631;    P05161;
IPI00374563;IPI00514026;IPI00976820;    O00468;
IPI00908418;    E7ERA6;
IPI00062955;IPI00002821;IPI00909677;    Q96HA4-1;Q96HA4-2;
IPI00641937;IPI00790556;IPI00889194;    Q6ZVT0-1;Q6ZVT0-2;Q6ZVT0-3;
IPI00001796;IPI00375404;IPI00217555;    Q9Y5U5-1;Q9Y5U5-2;Q9Y5U5-3;
IPI00515079;IPI00018859;    P43489;
IPI00514755;IPI00004233;IPI00106646;    Q9BRK5-1;Q9BRK5-2;
IPI00064848;    Q96L58;
IPI00373976;    Q5T7M4;
IPI00375728;IPI86;IPI00383350;  Q8N2K1-1;Q8N2K1-2;
IPI01022053;IPI00514605;IPI00514599;    P51172-1;P51172-2;

Table 2:

IPI00000207
IPI00000728
IPI00000733
IPI00000846
IPI00000893
IPI00001849
IPI00002214
IPI00002335
IPI00002349
IPI00002821
IPI00003362
IPI00003419
IPI00003865
IPI00004233
IPI00004399
IPI00004795
IPI00004977

1 answer:

Answer 0: (score: 1)

grep cannot prepend the needle to its output, so -f file2 alone will not do.

Use a loop and prepend the needle manually:

while IFS= read -r token; do grep -F "$token" file1 | xargs -I{} echo "$token" {}; done < file2

Alternatively, you can store the results of both grep and grep -o, then paste them together:

grep -f 2.txt 1.txt >a
grep -of 2.txt 1.txt >b
paste b a
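
As a rough illustration of the grep/paste idea, here is a sketch on a three-row excerpt of the question's data. The temporary directory, the sample excerpt, and the -F flag (literal matching) are assumptions added for the demo, and -o is a GNU/BSD grep extension rather than POSIX. Note that the two outputs only line up when every matching line contains exactly one needle, because grep -o prints one output line per match.

```shell
# Hypothetical sample files: a few rows taken from Table 1 and Table 2.
tmp=$(mktemp -d)
printf '%s\n' \
  'IPI00436567;    Q6VEP3;' \
  'IPI00420049;IPI00001849;    Q5SV97-1;Q5SV97-2;' \
  'IPI00514755;IPI00004233;IPI00106646;    Q9BRK5-1;Q9BRK5-2;' > "$tmp/1.txt"
printf '%s\n' IPI00000207 IPI00001849 IPI00004233 > "$tmp/2.txt"

# a: the full matching lines; b: only the matched needles, in the same order.
grep -F -f "$tmp/2.txt" "$tmp/1.txt" > "$tmp/a"
grep -F -o -f "$tmp/2.txt" "$tmp/1.txt" > "$tmp/b"

# paste joins line N of b with line N of a, separated by a tab.
result=$(paste "$tmp/b" "$tmp/a")
printf '%s\n' "$result"

rm -r "$tmp"
```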

If you are fine with using awk, try this:

awk 'FNR==NR { a[$0];next } { for (x in a) if ($0 ~ x) print x, $0 }' 2.txt 1.txt

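Explanation: for the first file (as long as FNR==NR), store every needle as a key of the array a ({ a[$0]; next }). Then (implicitly) loop over all lines of the second file, iterate over all needles again, and print the needle and the line whenever the needle is found in the line.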
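
The awk command above tests every needle as a regular expression against the whole line. A possible variant, sketched here under the assumption that the identifiers always sit in the first whitespace-separated column and are separated by semicolons (as in the sample data), splits that column and looks each ID up in the array directly. The sample excerpt and temporary directory are hypothetical and only serve the demo.

```shell
# Hypothetical sample files: a few rows taken from Table 1 and Table 2.
tmp=$(mktemp -d)
printf '%s\n' \
  'IPI00436567;    Q6VEP3;' \
  'IPI00420049;IPI00001849;    Q5SV97-1;Q5SV97-2;' \
  'IPI00514755;IPI00004233;IPI00106646;    Q9BRK5-1;Q9BRK5-2;' > "$tmp/1.txt"
printf '%s\n' IPI00000207 IPI00001849 IPI00004233 > "$tmp/2.txt"

# First file (FNR==NR): remember every needle as an array key.
# Second file: split the first column on ";" and print "needle line"
# for each non-empty ID that is a known needle (exact match, no regex).
result=$(awk 'FNR==NR { a[$0]; next }
  { n = split($1, ids, ";")
    for (i = 1; i <= n; i++)
      if (ids[i] != "" && (ids[i] in a)) print ids[i], $0
  }' "$tmp/2.txt" "$tmp/1.txt")
printf '%s\n' "$result"

rm -r "$tmp"
```

Because the lookup is an exact key test, a needle cannot match as a substring of a longer identifier or inside the second column, and each line of Table 1 is checked with one array lookup per ID instead of one regex test per needle.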