Summing specific fields with awk

Date: 2015-11-09 18:37:14

Tags: bash awk

I have a CSV file with this kind of information:

2013  Cat.1  10  Structure1  Code1  34.10
2014  Cat.1  25  Structure1  Code1  254.24
2013  Cat.2  250 Structure1  Code1  2456.4
2014  Cat.2  234 Structure1  Code1  2345.9
2013  Cat.1  5   Structure2  Code2  59
2013  Cat.1  1   Structure2  Code2  18
2014  Cat.1  8   Structure2  Code2  123
2014  Cat.1  1   Structure2  Code2  18
2013  Cat.2  64  Structure2  Code2  59
2013  Cat.2  8   Structure2  Code2  18
2014  Cat.2  70  Structure2  Code2  123
2014  Cat.2  11  Structure2  Code2  18

The result file I want looks like this:

2013  Cat.1  10         Structure1  Code1  34.10
2014  Cat.1  25         Structure1  Code1  254.24
2013  Cat.2  250        Structure1  Code1  2456.4
2014  Cat.2  234        Structure1  Code1  2345.9
2013  Cat.1  6 (5+1)    Structure2  Code2  77 (59+18)
2014  Cat.1  9 (8+1)    Structure2  Code2  141 (123+18)
2013  Cat.2  72 (64+8)  Structure2  Code2  77 (59+18)
2014  Cat.2  81 (70+11) Structure2  Code2  141 (123+18)

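Is this possible with awk? For the second structure I only have 2 different entries per group in this example, but there could be more...

I am very new to programming, especially awk.

Thanks for your answers!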

4 answers:

Answer 0: (score: 2)

awk to the rescue!

Not a complete solution, but it may give you some ideas

$ awk '{
    k = $1 FS $2 FS $4 FS $5
    a[k] += $3
    as[k] = as[k] ? as[k] "+" $3 : "(" $3
    b[k] += $6
    bs[k] = bs[k] ? bs[k] "+" $6 : "(" $6
  }

  END {
    for (k in a) {
      print k, a[k], as[k] ")", b[k], bs[k] ")"
    }
  }' file

will give you

2014 Cat.2 Structure2 Code2 81 (70+11) 141 (123+18)
2014 Cat.1 Structure2 Code2 9 (8+1) 141 (123+18)
2014 Cat.2 Structure1 Code1 234 (234) 2345.9 (2345.9)
2014 Cat.1 Structure1 Code1 25 (25) 254.24 (254.24)
2013 Cat.2 Structure2 Code2 72 (64+8) 77 (59+18)
2013 Cat.1 Structure2 Code2 6 (5+1) 77 (59+18)
2013 Cat.2 Structure1 Code1 250 (250) 2456.4 (2456.4)
2013 Cat.1 Structure1 Code1 10 (10) 34.1 (34.10)

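Note that the column order has changed so that k can be reused, and that single-entry values are also wrapped in parentheses. Both can be handled easily.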
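One way those two points might be handled is sketched below. It is not part of the original answer: the array names (`count`, `order`, `sum3`, `terms3`, and so on) and the temporary input file are assumptions made for illustration. The key is split back into its parts so the original column order can be restored, groups are printed in first-seen order, and the formula is only printed when a group has more than one entry.

```shell
# Sample data copied from the question into a temporary file (the path is arbitrary).
input=$(mktemp)
cat > "$input" <<'EOF'
2013  Cat.1  10  Structure1  Code1  34.10
2014  Cat.1  25  Structure1  Code1  254.24
2013  Cat.2  250 Structure1  Code1  2456.4
2014  Cat.2  234 Structure1  Code1  2345.9
2013  Cat.1  5   Structure2  Code2  59
2013  Cat.1  1   Structure2  Code2  18
2014  Cat.1  8   Structure2  Code2  123
2014  Cat.1  1   Structure2  Code2  18
2013  Cat.2  64  Structure2  Code2  59
2013  Cat.2  8   Structure2  Code2  18
2014  Cat.2  70  Structure2  Code2  123
2014  Cat.2  11  Structure2  Code2  18
EOF

result=$(awk '
{
    k = $1 FS $2 FS $4 FS $5
    if (!(k in count)) order[++groups] = k    # remember first-seen order
    count[k]++
    sum3[k] += $3
    sum6[k] += $6
    # Collect the individual terms as text: "5", then "5+1", ...
    terms3[k] = (count[k] > 1 ? terms3[k] "+" : "") $3
    terms6[k] = (count[k] > 1 ? terms6[k] "+" : "") $6
}
END {
    for (i = 1; i <= groups; i++) {
        k = order[i]
        split(k, f, FS)                        # f[1..4] = year, category, structure, code
        # Groups with a single entry keep their original text and get no formula.
        col3 = (count[k] > 1) ? sum3[k] " (" terms3[k] ")" : terms3[k]
        col6 = (count[k] > 1) ? sum6[k] " (" terms6[k] ")" : terms6[k]
        print f[1], f[2], col3, f[3], f[4], col6
    }
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because the terms are appended one at a time, the same script should also cope with groups that contain more than two entries.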

Answer 1: (score: 1)

Another awk answer, specific to GNU awk. I am assuming you do not actually want to print the addition formulas.

gawk '
  { data[$1 OFS $2][$4 OFS $5][1] += $3
    data[$1 OFS $2][$4 OFS $5][2] += $6 }
  END {
    for (k1 in data) {
      for (k2 in data[k1]) {
        print k1, data[k1][k2][1], k2, data[k1][k2][2]
      }
    }
  }
' file | sort -k4,5 -k2,2 -k1,1 | column -t
2013  Cat.1  10   Structure1  Code1  34.1
2014  Cat.1  25   Structure1  Code1  254.24
2013  Cat.2  250  Structure1  Code1  2456.4
2014  Cat.2  234  Structure1  Code1  2345.9
2013  Cat.1  6    Structure2  Code2  77
2014  Cat.1  9    Structure2  Code2  141
2013  Cat.2  72   Structure2  Code2  77
2014  Cat.2  81   Structure2  Code2  141
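The nested `data[...][...][...]` arrays above are a GNU awk feature. Where only a POSIX awk is available, a similar result can probably be obtained with flat arrays indexed by one combined key. The sketch below is an illustration under that assumption, not the answerer's code: the array names `sum3` and `sum6`, the shortened sample, and the temporary file are made up here, and `column -t` is left out so the output is plain space-separated text.

```shell
# A shortened sample in the format used in the question (temporary path is arbitrary).
input=$(mktemp)
cat > "$input" <<'EOF'
2013  Cat.1  10  Structure1  Code1  34.10
2014  Cat.1  25  Structure1  Code1  254.24
2013  Cat.1  5   Structure2  Code2  59
2013  Cat.1  1   Structure2  Code2  18
2014  Cat.1  8   Structure2  Code2  123
2014  Cat.1  1   Structure2  Code2  18
EOF

result=$(awk '
{
    # One flat key instead of nested arrays: year, category, structure, code.
    k = $1 OFS $2 OFS $4 OFS $5
    sum3[k] += $3
    sum6[k] += $6
}
END {
    for (k in sum3) {              # iteration order is unspecified, hence the sort below
        split(k, f, OFS)
        print f[1], f[2], sum3[k], f[3], f[4], sum6[k]
    }
}' "$input" | LC_ALL=C sort -k4,5 -k2,2 -k1,1)

printf '%s\n' "$result"
rm -f "$input"
```

As in the answer, the sort keys order the rows by structure and code first, then by category, then by year.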

Answer 2: (score: 0)

Here is a possible answer:

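awk 'BEGIN{FS="[ ]+"; OFS="\t";}
NR==FNR{
    # First pass: store every value of columns 3 and 6 per key.
    # a[key][n] is an array of arrays, which needs GNU awk 4.0 or later.
    key = $1"-"$2"-"$4"-"$5
    idx[key] = idx[key]+1
    a[key][idx[key]] = $3
    c[key][idx[key]] = $6
}
NR!=FNR{
    # Second pass: print each group once, at its first line.
    key = $1"-"$2"-"$4"-"$5
    if(idx[key]==1){$1=$1; print ;next;}
    if(idx[key]<0){next;}
    line1 =" ("a[key][1]
    line2 =" ("c[key][1]
    sum1 = a[key][1]
    sum2 = c[key][1]
    for(i = 2; i< idx[key]; i++)
    {
        line1 = line1"+"a[key][i]
        line2 = line2"+"c[key][i]
        sum1 = sum1+a[key][i]
        sum2 = sum2+c[key][i]
    }
    sum1 = sum1 + a[key][idx[key]]
    sum2 = sum2 + c[key][idx[key]]
    line1 = sum1""line1"+"a[key][idx[key]]")"
    line2 = sum2""line2"+"c[key][idx[key]]")"
    print $1, $2, line1, $4, $5, line2
    idx[key] = -1
}' inputFile inputFile

In this script, one or more spaces are interpreted as the field separator (FS="[ ]+"). In the output, fields are separated by tabs (OFS="\t"). Note that the script is called with inputFile twice as its arguments. If your input really is a CSV file, try exporting it with , as the field separator and set FS=OFS=","

Sample output for the input given in the question:

2013	Cat.1	10	Structure1	Code1	34.10
2014	Cat.1	25	Structure1	Code1	254.24
2013	Cat.2	250	Structure1	Code1	2456.4
2014	Cat.2	234	Structure1	Code1	2345.9
2013	Cat.1	6 (5+1)	Structure2	Code2	77 (59+18)
2014	Cat.1	9 (8+1)	Structure2	Code2	141 (123+18)
2013	Cat.2	72 (64+8)	Structure2	Code2	77 (59+18)
2014	Cat.2	81 (70+11)	Structure2	Code2	141 (123+18)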
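The idea of naming the input file twice, combined with FS=OFS=",", can be shown on a much smaller case. The sketch below is only an illustration: the comma-separated sample, the `total` array, and the temporary file are assumptions, not part of the answer. While the first copy of the file is being read, NR (records read so far) equals FNR (records read in the current file), so that block only collects totals; during the second copy each line is printed with its group total appended.

```shell
# Hypothetical comma-separated version of a few rows from the question.
input=$(mktemp)
cat > "$input" <<'EOF'
2013,Cat.1,5,Structure2,Code2,59
2013,Cat.1,1,Structure2,Code2,18
2014,Cat.1,8,Structure2,Code2,123
EOF

result=$(awk '
BEGIN { FS = OFS = "," }
NR == FNR {
    # First pass over the file: accumulate column 3 per group.
    total[$1 FS $2 FS $4 FS $5] += $3
    next
}
{
    # Second pass: the totals are complete, so they can be used on every line.
    print $0, total[$1 FS $2 FS $4 FS $5]
}' "$input" "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Reading the file twice keeps the original line order without storing the lines themselves, at the cost of a second read.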

Answer 3: (score: 0)

This one-liner will do the job:

awk 'BEGIN{g=1;s="%4s %5s %-12s %10s %5s %-12s\n"} f{printf s ,$1,$2,$3+a" ("a"+"$3")",$4,$5,$6+b" ("b"+"$6")";f=0;g=0} /Structure2/{a=$3;b=$6;f=g;g=1} /Structure1/{printf s,$1,$2,$3,$4,$5,$6}' file

2013  Cat.1  10         Structure1  Code1  34.10
2014  Cat.1  25         Structure1  Code1  254.24
2013  Cat.2  250        Structure1  Code1  2456.4
2014  Cat.2  234        Structure1  Code1  2345.9
2013  Cat.1  6 (5+1)    Structure2  Code2  77 (59+18)
2014  Cat.1  9 (8+1)    Structure2  Code2  141 (123+18)
2013  Cat.2  72 (64+8)  Structure2  Code2  77 (59+18)
2014  Cat.2  81 (70+11) Structure2  Code2  141 (123+18)

I added a format string for alignment. I used a width of 12 (%-12s) for the third and sixth columns; if the numbers get larger, you can increase it.
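Instead of increasing the width by hand, it could be derived from the data. The sketch below assumes the merged rows are already available as tab-separated text (an assumption made for this example, as are the `cell`, `w3`, and `w6` names and the temporary file). It measures the widest value in columns 3 and 6 and builds the printf format from those widths.

```shell
# Two already-merged rows, tab-separated (hypothetical intermediate output).
input=$(mktemp)
printf '2013\tCat.1\t10\tStructure1\tCode1\t34.10\n' > "$input"
printf '2014\tCat.2\t81 (70+11)\tStructure2\tCode2\t141 (123+18)\n' >> "$input"

result=$(awk '
BEGIN { FS = "\t" }
{
    # Keep every cell and track the widest value in columns 3 and 6.
    for (i = 1; i <= 6; i++) cell[NR, i] = $i
    if (length($3) > w3) w3 = length($3)
    if (length($6) > w6) w6 = length($6)
}
END {
    # Same layout as the one-liner, with the two fixed widths replaced.
    fmt = "%4s %5s %-" w3 "s %10s %5s %-" w6 "s\n"
    for (r = 1; r <= NR; r++)
        printf fmt, cell[r, 1], cell[r, 2], cell[r, 3], cell[r, 4], cell[r, 5], cell[r, 6]
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The rows have to be held in memory until the end, because the widths are only known once every line has been seen.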