Sum multiple columns of a file and display the highest total and its row using awk

Asked: 2016-05-31 19:15:38

Tags: bash file unix awk

I have a file with 5 columns in the following format:

$ cat test.txt
id;section;name;val1;val2
11;10;John;50;15
12;20;Sam;40;20
13;30;Jeny;30;30
14;10;Ted;60;10
15;10;Mary;30;5
16;20;Tim;15;15
17;30;Pen;20;100

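I want to process the data in the file based on a section number (column 2) that is passed in. For the given section I want to display the id, Name and Total (column 4 + column 5). Finally, I want to print the details of the row with the highest total.

I have put together the following awk command: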

 section=10 ; awk -F";" -v var="$section" 'BEGIN { print "id Name Total" } { if ($2 == var) { sum = $4 + $5 ;print $1 " "$3 " " sum ;if (sum>newsum) {newsum=sum;name=$3;id=$1}}} END { print "Max sum for section "var" is "newsum " for Name: " name " and ID: " id }' test.txt;

It displays the data as follows:

id Name Total
11 John 65
14 Ted 70
15 Mary 35
Max sum for section 10 is 70 for Name: Ted and ID: 14

But how do I handle the case where several records share the same highest Total?

1 Answer:

Answer 0: (score: 0)

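It all depends on how you want to handle it, I guess? You can keep the first record that reaches the maximum by comparing with >, keep the last one by comparing with >=, or keep all of them by using an array.

Assuming you want to display every record that shares the highest total: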
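To make the first two options concrete, here is a minimal sketch in which only the comparison operator changes. The sample data is a hypothetical variation of test.txt in which John and Ted both total 70; it is not the asker's file.

```shell
# Hypothetical tie data (not the asker's file): John and Ted both total 70.
data='id;section;name;val1;val2
11;10;John;50;20
14;10;Ted;60;10
15;10;Mary;30;5'

# Strict ">" keeps the first record that reaches the maximum.
first=$(printf '%s\n' "$data" | awk -F';' -v var=10 '
  $2 == var { sum = $4 + $5; if (sum > max) { max = sum; name = $3 } }
  END { print name }')

# ">=" lets a later record with an equal total replace the earlier one.
last=$(printf '%s\n' "$data" | awk -F';' -v var=10 '
  $2 == var { sum = $4 + $5; if (sum >= max) { max = sum; name = $3 } }
  END { print name }')

echo "first: $first"
echo "last: $last"
```

With this data the strict comparison reports John and the non-strict one reports Ted, so neither form alone can report both winners.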

% cat script.awk
BEGIN {
  FS=";";
  print "id Name Total";
}
$2 != var {next}           # Skip the header and any line from another section
{
  sum = $4 + $5;
  print $1 " " $3 " " sum;
}
sum > max {                # If sum > max we need to reset the arrays (names and ids)
  max = sum;               # because we get a new winner
  delete names;
  delete ids;
  l = 0;
}
sum >= max {               # If sum is same or higher than max we will need to add this
  l++;                     # to the list of winners.
  names[l] = $3;
  ids[l] = $1;
}
END {
  printf "Max sum for section %s is %d for\n", var, max;

  # Iterate through all "winners" and print them
  for ( i = 1; i <= l; i++ ) {
    printf "Name: %s, ID: %s\n", names[i], ids[i];
  }
}

Hope this gives you an idea of how to use arrays.

To run it:

section=10;
awk -F";" -v var="$section" -f script.awk test.txt
#                           ^ Instead of having awk on command line use script.awk
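As an alternative sketch of the same idea, the winners can be accumulated in a string rather than in arrays. This avoids `delete names` on a whole array, which some older awk implementations may not accept. The `seen` flag and the tie data are assumptions added for illustration; `seen` also lets a negative first total become the maximum.

```shell
# Hypothetical tie data (not the asker's file): John and Ted both total 70.
data='id;section;name;val1;val2
11;10;John;50;20
12;20;Sam;40;20
14;10;Ted;60;10
15;10;Mary;30;5'

section=10
result=$(printf '%s\n' "$data" | awk -F';' -v var="$section" '
  $2 != var { next }                      # skip the header and other sections
  { sum = $4 + $5 }
  sum > max || !seen {                    # new best total: start the list over
    max = sum; winners = ""; seen = 1
  }
  sum == max {                            # this record ties the current best
    winners = winners "Name: " $3 ", ID: " $1 "\n"
  }
  END {
    printf "Max sum for section %s is %d for\n", var, max
    printf "%s", winners
  }')
echo "$result"
```

For the tie data above this prints the header line followed by one line each for John (ID 11) and Ted (ID 14).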