Extracting specific fields from a tabular file with awk

Time: 2018-12-07 10:21:13

Tags: bash awk

I have a tabular file with this structure:

NAME                    ZONE
comp-envA-teamA-c9     europe-west4-a
comp-envA-teamA-11b    europe-west4-c
comp-envA-teamB-7r-v6  europe-west4-b
comp-envB-teamB-hx86   europe-west4-a
comp-envB-teamC-lbn7   europe-west4-c
envB-teamC-lcnh        europe-west4-a

I want to extract parts of these lines to produce the following output (shown here for the first rows):

"comp-envA-teamA-c9" is for "teamA" in zone "europe-west4-a"
"comp-envA-teamA-11b" is for "teamA" in zone "europe-west4-c"
"comp-envA-teamB-7r-v6" is for "teamB" in zone "europe-west4-b"

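I have tried many things with awk but could not get it to work.

My idea was to first build an array in an initial awk that uses tab as the delimiter, mapping the first element to the second: i.e. tab[comp-envA-teamA-c9] = europe-west4-a

Then, in a second awk, use the delimiter "-" to get the team information.

But I cannot manage to create such an array and pass it to the second awk.

Many thanks for your help!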

4 answers:

Answer 0: (score: 1)

If the word team* can appear anywhere in the first string, you can simply match on that word up to the next [-] separator.

AWK solution:

awk 'NR>1 { match($1,/team[^- ]+/); print("\"" $1 "\" is for \"" substr($1,RSTART,RLENGTH) "\" in zone \"" $2 "\""); }'

Test:

$ awk 'NR>1 { match($1,/team[^- ]+/); print("\"" $1 "\" is for \"" substr($1,RSTART,RLENGTH) "\" in zone \"" $2 "\""); }' teams.txt
"gke-envA-teamA-c9" is for "teamA" in zone "europe-west4-a"
"gke-envA-teamA-11b" is for "teamA" in zone "europe-west4-c"
"gke-envA-teamB-7r-v6" is for "teamB" in zone "europe-west4-b"
"gke-envB-teamB-hx86" is for "teamB" in zone "europe-west4-a"
"gke-envB-teamC-lbn7" is for "teamC" in zone "europe-west4-c"
"envB-teamC-lcnh" is for "teamC" in zone "europe-west4-a"

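To make the role of match() in the answer above more concrete, here is a minimal sketch that prints the values it leaves in RSTART and RLENGTH. The `sample` variable and the `team_positions` function name are illustrative assumptions, not part of the original answer; the rows are copied from the question.

```shell
# Hypothetical sample: two rows from the question, plus the header line
# so that NR > 1 has something to skip.
sample='NAME ZONE
comp-envA-teamB-7r-v6 europe-west4-b
envB-teamC-lcnh europe-west4-a'

# match() returns the 1-based start of the match (0 if there is none) and
# sets RSTART and RLENGTH; substr() then cuts the team name out of $1.
team_positions() {
    awk 'NR > 1 {
        if (match($1, /team[^- ]+/))
            print $1, RSTART, RLENGTH, substr($1, RSTART, RLENGTH)
        else
            print $1, "no team found"
    }'
}

result=$(printf '%s\n' "$sample" | team_positions)
printf '%s\n' "$result"
```

For these two rows the match starts at position 11 and 6 respectively, with a length of 5 in both cases. Testing the return value of match() is optional here, but it avoids printing an empty team name when a row has no team* part.
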
Answer 1: (score: 0)

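awk '
    # Wrap a string in double quotes.
    function wrap_quotes(str) {
        return "\"" str "\""
    }
    NR > 1 {
        # split() returns the number of pieces, which avoids calling
        # length() on an array (not supported by every awk).
        n = split($1, name_infos, "-")
        print wrap_quotes($1) " is for " wrap_quotes(name_infos[n - 1]) " in zone " wrap_quotes($2)
    }' filename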

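Returns (note that the third row yields "7r" rather than "teamB", because the code takes the second-to-last "-"-separated piece of the name):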

"comp-envA-teamA-c9" is for "teamA" in zone "europe-west4-a"
"comp-envA-teamA-11b" is for "teamA" in zone "europe-west4-c"
"comp-envA-teamB-7r-v6" is for "7r" in zone "europe-west4-b"
"comp-envB-teamB-hx86" is for "teamB" in zone "europe-west4-a"
"comp-envB-teamC-lbn7" is for "teamC" in zone "europe-west4-c"
"envB-teamC-lcnh" is for "teamC" in zone "europe-west4-a"

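Because the second-to-last piece is not always the team (see the "7r" row in the output above), one possible variation is to scan the pieces for the first one that begins with "team". This is only a sketch under the assumption that every team name starts with the literal prefix "team", as in the sample data; the `rows` variable and the `team_by_prefix` function are hypothetical names.

```shell
# Hypothetical sample: the header plus two rows from the question,
# including the one with an extra "-" after the team name.
rows='NAME ZONE
comp-envA-teamB-7r-v6 europe-west4-b
envB-teamC-lcnh europe-west4-a'

# Split the name on "-" and keep the first piece that starts with "team".
# Assumption: team names always carry the "team" prefix.
team_by_prefix() {
    awk 'NR > 1 {
        n = split($1, parts, "-")
        team = ""
        for (i = 1; i <= n; i++)
            if (parts[i] ~ /^team/) { team = parts[i]; break }
        printf "\"%s\" is for \"%s\" in zone \"%s\"\n", $1, team, $2
    }'
}

by_prefix=$(printf '%s\n' "$rows" | team_by_prefix)
printf '%s\n' "$by_prefix"
```

With this approach the position of the team inside the name no longer matters, at the cost of depending on the naming prefix.
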
Answer 2: (score: 0)

$ cat tst.awk
BEGIN { ofmt = "\"%s\" is for \"%s\" in zone \"%s\"\n" }
NR>1 {
    n = split($1,t,/-/)
    printf ofmt, $1, t[(n>3?3:2)], $2
}

$ awk -f tst.awk file
"comp-envA-teamA-c9" is for "teamA" in zone "europe-west4-a"
"comp-envA-teamA-11b" is for "teamA" in zone "europe-west4-c"
"comp-envA-teamB-7r-v6" is for "teamB" in zone "europe-west4-b"
"comp-envB-teamB-hx86" is for "teamB" in zone "europe-west4-a"
"comp-envB-teamC-lbn7" is for "teamC" in zone "europe-west4-c"
"envB-teamC-lcnh" is for "teamC" in zone "europe-west4-a"

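The index expression t[(n>3?3:2)] in the script above may be easier to follow with the intermediate values printed. The sketch below shows, for three rows from the question, how many pieces split() produces and which index is chosen. The `names` variable and the `field_counts` function are illustrative names only, and the rule itself assumes the naming convention visible in the sample (names with more than three pieces carry a leading prefix such as "comp").

```shell
# Hypothetical sample: the header plus three rows from the question.
names='NAME ZONE
comp-envA-teamA-c9 europe-west4-a
comp-envA-teamB-7r-v6 europe-west4-b
envB-teamC-lcnh europe-west4-a'

# Print the piece count, the chosen index, and the piece at that index.
field_counts() {
    awk 'NR > 1 {
        n = split($1, t, /-/)        # n is the number of "-"-separated pieces
        idx = (n > 3 ? 3 : 2)        # prefixed names use piece 3, others piece 2
        print n, idx, t[idx]
    }'
}

counts=$(printf '%s\n' "$names" | field_counts)
printf '%s\n' "$counts"
```

The three rows give 4, 5 and 3 pieces, so the team is read from piece 3, 3 and 2 respectively.
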
Answer 3: (score: 0)

If you are open to using Perl, it is just a one-liner:

/tmp> cat thomas.txt
comp-envA-teamA-c9     europe-west4-a
comp-envA-teamA-11b    europe-west4-c
comp-envA-teamB-7r-v6  europe-west4-b
comp-envB-teamB-hx86   europe-west4-a
comp-envB-teamC-lbn7   europe-west4-c
envB-teamC-lcnh        europe-west4-a
/tmp> perl -lane ' /(team.*?)-/; print "\"$F[0]\" is for \"$1\" in zone \"$F[1]\"" ' thomas.txt
"comp-envA-teamA-c9" is for "teamA" in zone "europe-west4-a"
"comp-envA-teamA-11b" is for "teamA" in zone "europe-west4-c"
"comp-envA-teamB-7r-v6" is for "teamB" in zone "europe-west4-b"
"comp-envB-teamB-hx86" is for "teamB" in zone "europe-west4-a"
"comp-envB-teamC-lbn7" is for "teamC" in zone "europe-west4-c"
"envB-teamC-lcnh" is for "teamC" in zone "europe-west4-a"
/tmp>