I have the following file, extract_info.txt:
ABC
PNG
CHNS
and to_extractfrom.txt, from which I need to retrieve information:
ABC 123 234 TCHSL
NBV 234 23764 DHG
CHNS 123 347 CGJKS
CVS 233 4747 JSHGD
PNG 122 324 HGH
SJDH 373 3487 JHG
I am running the following code:
while read line
do
gene=$(echo $line | awk -F' ' '{print $1}')
app1=$(awk -v comp1="$gene" '(comp1==$1) {print $1 }' to_extractfrom.txt)
done < extract_info.txt
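However, what I want is this: for each entry in extract_info.txt, find the line in to_extractfrom.txt whose first column matches, and print the first column of the line after it, the matched entry itself, and the first column of the line before it (with - when there is no previous line). For the entries in the first file, the output would be: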
NBV ABC -
SJDH PNG CVS
CVS CHNS NBV
Answer 0 (score: 3)
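awk '
BEGIN {prev = "-"}                                  # placeholder used when a match has no previous line
NR == FNR {extract[$1] = 1; next}                   # first file: remember the keys to look up
is_match {print $1, m1, m2; is_match = 0}           # line after a match: print next, key, previous
$1 in extract {is_match = 1; m1 = $1; m2 = prev}    # matching line: save the key and the previous first column
{prev = $1}                                         # remember this first column for the following line
' extract_info.txt to_extractfrom.txt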
NBV ABC -
CVS CHNS NBV
SJDH PNG CVS
If your output must be in the same order as the extract_info file, and you are using GNU awk, you can do:
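gawk '
BEGIN {prev = "-"}
NR == FNR {extract[$1] = FNR; next}                     # store each key with its line number in extract_info.txt
is_match {output[m1] = $1 FS m1 FS m2; is_match = 0}    # buffer the result instead of printing it right away
$1 in extract {is_match = 1; m1 = $1; m2 = prev}
{prev = $1}
END {
    PROCINFO["sorted_in"] = "@val_num_asc"              # gawk only: traverse extract by ascending stored line number
    for (key in extract) print output[key]
}
' extract_info.txt to_extractfrom.txt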
NBV ABC -
SJDH PNG CVS
CVS CHNS NBV
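If GNU awk is not available, one possible alternative is to swap the order of the two files: read to_extractfrom.txt first, record the previous and next first-column value for every line, then read extract_info.txt and print in its own order. This is only a sketch under some assumptions that are not stated in the question: first-column values in to_extractfrom.txt are unique, and a key on the last line gets "-" as its next value (the commands above print nothing in that case). The array names prv and nxt, the variable last, and the scratch directory are just illustrative.

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' ABC PNG CHNS > extract_info.txt
printf '%s\n' 'ABC 123 234 TCHSL' 'NBV 234 23764 DHG' 'CHNS 123 347 CGJKS' \
    'CVS 233 4747 JSHGD' 'PNG 122 324 HGH' 'SJDH 373 3487 JHG' > to_extractfrom.txt

# Note the argument order: the data file comes first, the key file second.
result=$(awk '
NR == FNR {                              # first file: to_extractfrom.txt
    if (FNR > 1) nxt[last] = $1          # this line is the "next" of the line before it
    prv[$1] = (FNR > 1) ? last : "-"     # assumption: "-" when there is no previous line
    last = $1
    next
}
$1 in prv {                              # second file: extract_info.txt, in its own order
    n = ($1 in nxt) ? nxt[$1] : "-"      # assumption: "-" when there is no next line
    print n, $1, prv[$1]
}
' to_extractfrom.txt extract_info.txt)

printf '%s\n' "$result"
```

This uses only standard awk features, so it should not depend on PROCINFO. Keys from extract_info.txt that never appear in to_extractfrom.txt are skipped silently.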