AWK: Add a new field from another file

Time: 2015-06-07 19:43:59

Tags: linux bash loops awk

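I have a file delimited by "@". It contains repeating data that can be used to split the file into sections. In another file I have data that I want to add to the first file as an extra column. The value taken from the second file should advance by one each time the repeating data starts over in the first file. The files look like this:

File 1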

Race1@300Yards@6
Race2@300Yards@7
Race3@250Yards@7
Race4@250Yards@7
Race5@250Yards@8
Race6@250Yards@9
Race7@300Yards@10
Race8@300Yards@12
Race1@330Yards@10
Race2@300Yards@10
Race3@300Yards@10
Race4@300Yards@10
Race5@11/2Miles@11
Race6@7Miles@9
Race7@6Miles@8
Race8@51/2Miles@7
Race9@1Mile@8
Race10@51/2Miles@12
Race1@61/2Miles@6
Race2@11/16Miles@9
Race3@1Mile@9
Race4@11/2Miles@6
Race5@11/16Miles@10
Race6@1Mile@10
Race7@11/16Miles@12
Race8@1Mile@12

The other file looks like this:

File 2

London
New York
Dallas

The desired result is:

Race1@300Yards@6@London
Race2@300Yards@7@London
Race3@250Yards@7@London
Race4@250Yards@7@London
Race5@250Yards@8@London
Race6@250Yards@9@London
Race7@300Yards@10@London
Race8@300Yards@12@London
Race1@330Yards@10@New York
Race2@300Yards@10@New York
Race3@300Yards@10@New York
Race4@300Yards@10@New York
Race5@11/2Miles@11@New York
Race6@7Miles@9@New York
Race7@6Miles@8@New York
Race8@51/2Miles@7@New York
Race9@1Mile@8@New York
Race10@51/2Miles@12@New York
Race1@61/2Miles@6@Dallas
Race2@11/16Miles@9@Dallas
Race3@1Mile@9@Dallas
Race4@11/2Miles@6@Dallas
Race5@11/16Miles@10@Dallas
Race6@1Mile@10@Dallas
Race7@11/16Miles@12@Dallas
Race8@1Mile@12@Dallas

I know awk can be used to split the race locations on "Race1". I think it starts with:

awk '/Race1/{x="Race"++i;}{print $5= something relating to file 2}

Does anyone know how to process the two files with loops and conditionals, using awk or any other Linux command?

1 Answer:

Answer 0: (score: 1)

If you save this as a.awk

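BEGIN {
    FS = OFS = "@"   # split and join fields on "@"
    i = 0            # next free index in the city array
    j = -1           # index of the current city; the first Race1 bumps it to 0
}
# First file on the command line (file2): store each city name.
NR == FNR {
    a[i++] = $1
}
# Second file (file1): append the current city as field 4.
NR != FNR {
    if ($1 == "Race1")   # Race1 marks the start of a new group
        j++
    $4 = a[j]
    print
}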

and run

awk -f a.awk file2 file1

you will get the desired result.

Output

Race1@300Yards@6@London
Race2@300Yards@7@London
Race3@250Yards@7@London
Race4@250Yards@7@London
Race5@250Yards@8@London
Race6@250Yards@9@London
Race7@300Yards@10@London
Race8@300Yards@12@London
Race1@330Yards@10@New York
Race2@300Yards@10@New York
Race3@300Yards@10@New York
Race4@300Yards@10@New York
Race5@11/2Miles@11@New York
Race6@7Miles@9@New York
Race7@6Miles@8@New York
Race8@51/2Miles@7@New York
Race9@1Mile@8@New York
Race10@51/2Miles@12@New York
Race1@61/2Miles@6@Dallas
Race2@11/16Miles@9@Dallas
Race3@1Mile@9@Dallas
Race4@11/2Miles@6@Dallas
Race5@11/16Miles@10@Dallas
Race6@1Mile@10@Dallas
Race7@11/16Miles@12@Dallas
Race8@1Mile@12@Dallas

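Explanation

We first set both the input and the output field separator to @. We also initialize the variables i and j, which are used as array indices.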
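Setting OFS as well as FS matters because assigning to a field makes awk rebuild the record using OFS. A minimal sketch of that behavior, assuming a single sample line taken from file1 (the variable names with_ofs and without_ofs are only illustrative):

```shell
# With FS and OFS both set to "@", the rebuilt record keeps the "@" delimiter.
with_ofs=$(echo 'Race1@300Yards@6' | awk 'BEGIN { FS = OFS = "@" } { $4 = "London"; print }')
echo "$with_ofs"

# With only FS set, the record is rebuilt with the default OFS (a single space).
without_ofs=$(echo 'Race1@300Yards@6' | awk -F'@' '{ $4 = "London"; print }')
echo "$without_ofs"
```

The first command prints Race1@300Yards@6@London; the second prints the same four fields separated by spaces.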

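The first condition, NR == FNR, checks whether we are reading file2, the first file on the command line. In that block we store the first field (the city name) at index i and then increment i.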
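To see why NR == FNR identifies the first file, it can help to print both counters for every line. The sketch below uses two small hypothetical files (cities.txt and races.txt, created in a temporary directory purely for illustration): NR counts lines across all input, while FNR restarts at 1 for each file.

```shell
# Hypothetical two-line sample files, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' 'London' 'New York' > "$tmp/cities.txt"
printf '%s\n' 'Race1@300Yards@6' 'Race2@300Yards@7' > "$tmp/races.txt"

# For every input line, print the file name, NR, FNR, and which branch NR == FNR selects.
nr_fnr=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, (NR == FNR ? "first file" : "later file") }' cities.txt races.txt)
echo "$nr_fnr"
```

The two counters agree only while the first file is being read. Note that the test relies on the first file being non-empty; with an empty first file, NR == FNR would also hold for the second file.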

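The second condition, NR != FNR, checks whether we are reading file1, the second file on the command line. If the first field equals Race1, we increment j (note that j was initialized to -1, so the first Race1 moves it to index 0). We then set the fourth field to a[j] and print the line.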
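The same logic can also be written as a one-off command without a separate script file. The following is only a sketch of an equivalent variant, not the answer's own code: it uses next to skip the remaining rules while reading the first file, and the names city, n, and j are arbitrary. The input here is a shortened, hypothetical excerpt of the two files, written to a temporary directory.

```shell
# Shortened sample input, for illustration only.
tmp=$(mktemp -d)
printf '%s\n' 'London' 'New York' 'Dallas' > "$tmp/file2"
printf '%s\n' 'Race1@300Yards@6' 'Race2@300Yards@7' \
              'Race1@330Yards@10' 'Race2@300Yards@10' \
              'Race1@61/2Miles@6' > "$tmp/file1"

result=$(awk -F'@' -v OFS='@' '
    NR == FNR     { city[n++] = $0; next }  # first file: store each city, skip other rules
    $1 == "Race1" { j++ }                   # a new group starts in the second file
                  { print $0, city[j - 1] } # append the city of the current group
' "$tmp/file2" "$tmp/file1")
echo "$result"
```

Here j starts at 0 (awk's default for an unset variable in numeric context), so the first Race1 raises it to 1 and city[j - 1] selects the first city. If file1 contained more Race1 groups than file2 has cities, the extra groups would get an empty fourth field.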