How do I correctly print associative arrays in a loop using AWK?

Time: 2018-08-04 04:08:29

Tags: arrays string awk substring

Beth    45  0
Danny   33  0
Thomas  22  40  
Mark    65  100 
Mary    29  121 
Susie   39  76.5
Joey    51  189.52
Peter   23  78.26
Maximus 34  289.71
Rebecca 21  45.79
Sophie  26  28.44
Barbara 24  107.36
Elizabeth   35  105.69
Peach   40  102.69
Lily    41  123 

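Above is a data file with three fields: name, age, and salary.

I want to print the average salary, the head count, and the names of the people aged 30 or over and of the people under 30.

In this exercise, I want to practice using strings as subscripts.

Here is my AWK code: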

BEGIN { OFS = "\t\t" }   
{
    if ($2 < 30) 
    {   
        a = "age below 30";
        salary[a] += $NF; 
        count[a]++;
        name[a] = name[a] $1 "\t";
    }   
    else
    {   
        a = "age equals or above 30";
        salary[a] += $NF; 
        count[a]++;
        name[a] = name[a] $1 "\t";
    }   
}

END {
    for (a in salary)
        for (a in count)
            for (a in name)
            {
                print "The average salary of " a " is " salary[a] / count[a];
                print "There are " count[a] " people "  a ; 
                print "Their names are " name[a];
                print "********************************************************";
            }
}

Here is the output:

The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age below 30 is 70.1417
There are 6 people age below 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************
The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age below 30 is 70.1417
There are 6 people age below 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************
The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age below 30 is 70.1417
There are 6 people age below 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************
The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age below 30 is 70.1417
There are 6 people age below 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************

I'm having trouble understanding this output.

What I expected was something like this:

The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************
The average salary of age equals or above 30 is 109.679
There are 6 people age below 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age equals or above 30 is 109.679
There are 6 people age below 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************
The average salary of age below 30 is 70.1417
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age below 30 is 70.1417
There are 9 people age equals or above 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************
The average salary of age below 30 is 70.1417
There are 6 people age below 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age below 30 is 70.1417
There are 6 people age below 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************

So my first question is: where did I go wrong in my understanding?

My second question is: I don't actually need that many loops. I only need

The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth   Peach   Lily    
********************************************************
The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Thomas  Mary    Peter   Rebecca Barbara Sophie  
********************************************************

for (a in salary, count, names) doesn't work. Is there a better way?

1 Answer:

Answer 0: (score: 1)

for (x in salary)
    for (y in count)
        for (z in name)
            print "foo"

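This means: for every index in salary, loop through every index in count, and for every index in count, loop through every index in name, printing "foo" each time. So if salary, count, and name each have 3 entries, "foo" is printed 3 * 3 * 3 = 27 times.

Your code is more complicated than that, because you use the same variable to hold the index of each array at every level of the nested loops: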
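To check the arithmetic, here is a minimal sketch that counts how often the innermost body runs. The three arrays are filled with made-up placeholder entries (three each) purely for the demonstration; they are not the data from the question.

```shell
# Count how many times the body of three nested for-in loops executes.
# The array contents are placeholders: split() gives each array indices 1..3.
total=$(awk 'BEGIN {
    split("a b c", salary)
    split("a b c", count)
    split("a b c", name)
    for (x in salary)
        for (y in count)
            for (z in name)
                n++          # one increment per innermost iteration
    print n
}')
echo "$total"
```

With three entries per array, the counter ends at 27, one increment for every (x, y, z) combination.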

for (a in salary)
    for (a in count)
        for (a in name)

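So I'm not sure how awk will handle that; it may even be undefined behavior.

Since all 3 arrays have the same indices, just pick one array and loop over its indices; you can then use that same index to access all 3 arrays.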
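Whatever an implementation does with the loop variable, one thing follows from there being only a single variable a: inside the body, salary[a], count[a], and name[a] are all looked up with the same key, so a block can never mix the average of one group with the names of another. The sketch below illustrates this with hypothetical two-entry arrays whose values simply repeat their own key; the number of lines printed may depend on the awk implementation, so no specific count is claimed here.

```shell
# Hypothetical arrays: each value repeats its own key, so any mixing of
# groups within one printed line would be immediately visible.
lines=$(awk 'BEGIN {
    salary["below"] = "below"; salary["above"] = "above"
    count["below"]  = "below"; count["above"]  = "above"
    name["below"]   = "below"; name["above"]   = "above"
    # Same loop variable at every level, as in the question.
    for (a in salary)
        for (a in count)
            for (a in name)
                print salary[a], count[a], name[a]
}')
printf '%s\n' "$lines"
```

Every printed line shows the same key three times, which is why the question's output only ever contains the two consistent blocks, repeated.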

$ cat tst.awk
{
    bracket = "age " ($2 < 30 ? "under" : "equals or above") " 30"

    names[bracket] = (bracket in names ? names[bracket] "\t" : "") $1
    count[bracket]++
    salary[bracket] += $NF
}
END {
    for (bracket in names) {
        print "The average salary of", bracket, "is", salary[bracket] / count[bracket]
        print "There are", count[bracket], "people",  bracket
        print "Their names are", names[bracket]
        print "********************************************************"
    }
}

$ awk -f tst.awk file
The average salary of age equals or above 30 is 109.679
There are 9 people age equals or above 30
Their names are Beth    Danny   Mark    Susie   Joey    Maximus Elizabeth       Peach   Lily
********************************************************
The average salary of age under 30 is 70.1417
There are 6 people age under 30
Their names are Thomas  Mary    Peter   Rebecca Sophie  Barbara
********************************************************
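
One caveat: awk does not specify the order in which for (bracket in names) visits the indices, so the two blocks could come out in either order. If the order matters, a possible variation (a sketch, not part of the original answer) is to walk an explicit list of bracket labels in END. The order array, the ";" separator, and the five-row sample taken from the top of the data file are choices made for this illustration.

```shell
# Same grouping as tst.awk, but END walks a fixed list of labels so the
# output order does not depend on for-in traversal order.
report=$(awk '
{
    bracket = "age " ($2 < 30 ? "under" : "equals or above") " 30"
    names[bracket] = (bracket in names ? names[bracket] "\t" : "") $1
    count[bracket]++
    salary[bracket] += $NF
}
END {
    # Illustrative fixed ordering: under 30 first, then 30 and over.
    n = split("age under 30;age equals or above 30", order, ";")
    for (i = 1; i <= n; i++) {
        bracket = order[i]
        if (!(bracket in count))
            continue    # no input rows fell into this bracket
        print "The average salary of", bracket, "is", salary[bracket] / count[bracket]
        print "There are", count[bracket], "people", bracket
        print "Their names are", names[bracket]
        print "********************************************************"
    }
}' <<'EOF'
Beth    45  0
Danny   33  0
Thomas  22  40
Mark    65  100
Mary    29  121
EOF
)
printf '%s\n' "$report"
```

The membership test with in also guards against dividing by zero when one of the brackets has no rows.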