gawk - sorting values from array of arrays

Time: 2015-06-25 18:30:03

Tags: multidimensional-array gawk percentile

I'm using gawk 4 to build arrays of arrays and need to compute percentile data from them. I need to sort the values in ascending order, which doesn't appear to be possible using asort when working with multidimensional arrays. Some of my values will be duplicate integers, and I need to keep all duplicates.

Here is what my data looks like. Element names for [a] and [b] end up being unique strings. Array [b] then has elements named 1, 2, 3, etc., whose values are the data I need to sort on.

mArray[a][b][1]=3456
mArray[a][b][2]=1456
mArray[a][b][3]=1456
...
mArray[a][b][1]=9233
mArray[a][b][2]=9233
mArray[a][b][3]=1234
...
mArray[a][b][1]=4567
mArray[a][b][2]=4567
mArray[a][b][3]=3097

I figure I can create regular arrays from each unique [a] element, insert the values from its corresponding [b][x], and then asort on that, but then I lose whatever duplicate values exist. Right now I am hacking it by walking mArray, writing to different files based on the name of [a], printing out all values under [b][x], and then running sort. I'm curious whether there is a more elegant way of doing it.

Here is what I tried, using asort against my mArray to test getting proper output. After 30 minutes I get no output or errors.

for ( a in mArray ) {
 for ( b in mArray[a] ) {
  n=asort(mArray[a][b][c])
  print n
 }
}

Background: I'm parsing CSV reports from a network monitoring system, grabbing throughput sample data, then aggregating those values across all interfaces to determine the 95th percentile of a device's total throughput.

Edit

Desired output format after sorting would be:

mArray[a][b][1]=1456
mArray[a][b][2]=1456
mArray[a][b][3]=3456
...
mArray[a][b][1]=1234
mArray[a][b][2]=9233
mArray[a][b][3]=9233
...
mArray[a][b][1]=3097
mArray[a][b][2]=4567
mArray[a][b][3]=4567

1 answer:

Answer 0 (score: 1)

You have to sort myArray[a][b], not myArray[a][b][c], because c doesn't even exist ;)

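If you don't want to sort in place, you have to pass a destination array as the second argument to asort. This is gawk-specific, though I don't know since which version; it works in gawk 4.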

Then you have to print the array element by element...

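for ( a in myArray ) {
 for ( b in myArray[a] ) {
  # sort into n so myArray[a][b] is left untouched; cnt is the element count
  cnt = asort(myArray[a][b], n)
  # n is indexed 1..cnt; count upward, since "for (i in n)" has no guaranteed order
  for ( i = 1; i <= cnt; i++ ) print "m["a"]["b"]["i"]="n[i]
 }
}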