Shell script to combine three files using AWK

Time: 2012-07-10 14:37:28

Tags: bash shell awk

I have three files: G_P_map.txt, G_S_map.txt and S_P_map.txt. I have to combine these three files using awk. Sample contents are as follows:

(G_P_map.txt contains)

test21g|A-CZ|1mos
test21g|A-CZ|2mos
 ...

(G_S_map.txt contains)

nwtestn5|A-CZ
nwtestn6|A-CZ
 ...

(S_P_map.txt contains)

3mos|nwtestn5
4mos|nwtestn6

Expected output:

1mos, 3mos
2mos, 4mos

This is the code I tried. I was able to combine the first two files, but I can't work in the third one.

awk -F"|" 'NR==FNR {file1[$1]=$1; next} {$2=file[$1]; print}' G_S_map.txt S_P_map.txt 

Any ideas or help would be much appreciated. Thanks in advance!

2 answers:

Answer 0: (score: 3)

I would look at a combination of join and cut.
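A minimal sketch of what that could look like, not taken from the answer itself. It assumes hypothetical sample data in which every key is unique (the question's sample repeats A-CZ), writes its intermediate files to a temporary directory, and uses sed to turn the final delimiter into ", ". join needs both inputs sorted on the join field, so each file is sorted first.

```shell
#!/bin/sh
# Hypothetical sample data: unlike the sample in the question, every key is unique.
export LC_ALL=C                  # sort and join must agree on collation order
dir=$(mktemp -d)
printf 'test21g|A-CZ|1mos\ntest22g|B-CZ|2mos\n' > "$dir/G_P_map.txt"
printf 'nwtestn5|A-CZ\nnwtestn6|B-CZ\n'         > "$dir/G_S_map.txt"
printf '3mos|nwtestn5\n4mos|nwtestn6\n'         > "$dir/S_P_map.txt"

# join requires both inputs to be sorted on the field being joined.
sort -t'|' -k1,1 "$dir/G_S_map.txt" > "$dir/gs.sorted"
sort -t'|' -k2,2 "$dir/S_P_map.txt" > "$dir/sp.sorted"
sort -t'|' -k2,2 "$dir/G_P_map.txt" > "$dir/gp.sorted"

# Step 1: match field 1 of G_S_map with field 2 of S_P_map.
# Each output line looks like: nwtestn5|A-CZ|3mos
# Re-sort on field 2 so the result can be joined again.
join -t'|' -1 1 -2 2 "$dir/gs.sorted" "$dir/sp.sorted" |
    sort -t'|' -k2,2 > "$dir/step1.sorted"

# Step 2: match field 2 of G_P_map with field 2 of the step 1 result.
# Each joined line looks like: A-CZ|test21g|1mos|nwtestn5|3mos
# cut keeps fields 3 and 5; sed rewrites the delimiter as ", ".
result=$(join -t'|' -1 2 -2 2 "$dir/gp.sorted" "$dir/step1.sorted" |
    cut -d'|' -f3,5 | sed 's/|/, /')

printf '%s\n' "$result"
rm -rf "$dir"
```

With the hypothetical data above this prints the two lines from the expected output. The rows come out in key order rather than in the original file order, because the inputs have to be sorted for join.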

Answer 1: (score: 2)

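GNU AWK (gawk) 4 has BEGINFILE and ENDFILE, which would be perfect for this. However, the gawk manual includes a function that provides this functionality for most versions of AWK.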

#!/usr/bin/awk -f

BEGIN {
    FS = "|"
}

function beginfile(ignoreme) {
    files++
}

function endfile(ignoreme) {
    # endfile() would be defined here if we were using it
}

FILENAME != _oldfilename \
{
    if (_oldfilename != "")
        endfile(_oldfilename)
    _oldfilename = FILENAME
    beginfile(FILENAME)
}

END   { endfile(FILENAME) }

files == 1 {    # save all the key, value pairs from file 1
    file1[$2] = $3
    next
}

files == 2 {    # save all the key, value pairs from file 2
    file2[$1] = $2
    next
}

files == 3 {    # perform the lookup and output
    print file1[file2[$2]], $1
}    

# Place the regular END block here, if needed. It would be in addition to the one above (there can be more than one)

Call the script like this:

./scriptname G_P_map.txt G_S_map.txt S_P_map.txt
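The file counter that beginfile() maintains in the script above can also be kept with a single rule, since FNR resets to 1 at the start of each input file. The following is a compact sketch of the same lookup under that assumption, not the answerer's code. The sample data is hypothetical and uses unique keys; with a repeated key such as A-CZ, a plain array lookup keeps only the last value stored. The array names gp and gs and the ", " separator are choices made for this illustration.

```shell
#!/bin/sh
# Hypothetical sample data with unique keys in every file.
dir=$(mktemp -d)
printf 'test21g|A-CZ|1mos\ntest22g|B-CZ|2mos\n' > "$dir/G_P_map.txt"
printf 'nwtestn5|A-CZ\nnwtestn6|B-CZ\n'         > "$dir/G_S_map.txt"
printf '3mos|nwtestn5\n4mos|nwtestn6\n'         > "$dir/S_P_map.txt"

result=$(awk -F'|' '
    FNR == 1   { files++ }                     # first record of each file: bump the counter
    files == 1 { gp[$2] = $3; next }           # G_P_map: group key -> value to print first
    files == 2 { gs[$1] = $2; next }           # G_S_map: S key -> group key
    files == 3 { print gp[gs[$2]] ", " $1 }    # S_P_map: chain both lookups and print the pair
' "$dir/G_P_map.txt" "$dir/G_S_map.txt" "$dir/S_P_map.txt")

printf '%s\n' "$result"
rm -rf "$dir"
```

One caveat with this shortcut: an empty input file contributes no records, so the counter does not advance for it and the later files would be treated as the wrong ones.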