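I have two tables. Table 1 has multiple columns; Table 2 has a single column. How can I extract the rows of Table 1 that contain a value from Table 2? I think a simple grep should work, but how do I run a grep for each row? I want the output to keep the matching Table 2 identifier in front of each extracted row.
Thanks!
Desired output: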
IPI00004233 IPI00514755;IPI00004233;IPI00106646; Q9BRK5-1;Q9BRK5-2;
IPI00001849 IPI00420049;IPI00001849; Q5SV97-1;Q5SV97-2;
...
......
Table 1:
IPI00436567; Q6VEP3;
IPI00169105;IPI01010102; Q8NH21;
IPI00465263; Q6IEY1;
IPI00465263; Q6IEY1;
IPI00478224; A6NHI5;
IPI00853584;IPI00000733;IPI00166122; Q96NU1-1;Q96NU1-2;
IPI00411886;IPI00921079;IPI00385785; Q9Y3T9;
IPI01010975;IPI00418437;IPI01013997;IPI00329191; Q6TDP4;
IPI00644132;IPI00844469;IPI00030240; Q494U1-1;Q494U1-2;
IPI00420049;IPI00001849; Q5SV97-1;Q5SV97-2;
IPI00966381;IPI00917954;IPI00028151; Q9HCC6;
IPI00375631; P05161;
IPI00374563;IPI00514026;IPI00976820; O00468;
IPI00908418; E7ERA6;
IPI00062955;IPI00002821;IPI00909677; Q96HA4-1;Q96HA4-2;
IPI00641937;IPI00790556;IPI00889194; Q6ZVT0-1;Q6ZVT0-2;Q6ZVT0-3;
IPI00001796;IPI00375404;IPI00217555; Q9Y5U5-1;Q9Y5U5-2;Q9Y5U5-3;
IPI00515079;IPI00018859; P43489;
IPI00514755;IPI00004233;IPI00106646; Q9BRK5-1;Q9BRK5-2;
IPI00064848; Q96L58;
IPI00373976; Q5T7M4;
IPI00375728;IPI86;IPI00383350; Q8N2K1-1;Q8N2K1-2;
IPI01022053;IPI00514605;IPI00514599; P51172-1;P51172-2;
Table 2:
IPI00000207
IPI00000728
IPI00000733
IPI00000846
IPI00000893
IPI00001849
IPI00002214
IPI00002335
IPI00002349
IPI00002821
IPI00003362
IPI00003419
IPI00003865
IPI00004233
IPI00004399
IPI00004795
IPI00004977
Answer 0 (score: 1)
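grep cannot prepend the needle to the lines it prints, so -f file2 on its own will not give you the identifier column.
Use a loop and prepend the needle manually (here file1 is Table 1 and file2 is Table 2):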
while read -r token; do grep -F -- "$token" file1 | xargs -I{} echo "$token" {}; done < file2
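Alternatively, you can store the output of grep and of grep -o in two files and join them with paste. Note that the two files only line up if every matching line contains exactly one needle, because grep -o prints one output line per match: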
grep -f 2.txt 1.txt >a
grep -of 2.txt 1.txt >b
paste b a
If you are fine with using awk, try this:
awk 'FNR==NR { a[$0];next } { for (x in a) if ($0 ~ x) print x, $0 }' 2.txt 1.txt
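Explanation: while reading the first file (that is, as long as FNR==NR), store every needle as a key of the array a ({ a[$0];next }). Then awk implicitly loops over all lines of the second file; for each line, iterate over all needles again and print the needle followed by the line whenever the needle is found in it.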
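The awk one-liner above matches each needle as a regular expression anywhere in the line, so a needle that happens to be a prefix of a longer identifier would also match. If that matters, one possible variant is to split the first field on ";" and look each identifier up exactly. The following is only a sketch under that assumption; the function name match_ids is made up for illustration and is not part of the original answer:

```shell
# Hypothetical helper (name chosen for illustration only).
# $1 = identifier list, one ID per line (Table 2)
# $2 = table whose first field holds ';'-separated IDs (Table 1)
# Prints "<matching ID> <full line>" for every exact ID match.
match_ids() {
  awk '
    FNR == NR { wanted[$0]; next }   # first file: remember every ID as a key
    {
      n = split($1, ids, ";")        # second file: split field 1 on ";"
      for (i = 1; i <= n; i++)
        # skip the empty piece after the trailing ";" and do an exact lookup
        if (ids[i] != "" && (ids[i] in wanted))
          print ids[i], $0
    }
  ' "$1" "$2"
}
```

It would be called as match_ids 2.txt 1.txt. Because each identifier is looked up directly in the array, every line of Table 1 is checked once against its own identifiers rather than against every needle. As with the one-liner, the FNR==NR test assumes the first file is not empty.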