awk一行如果前两列相同则减去下一行

时间:2014-07-08 01:49:38

标签: awk

如果他们的1美元和1美元我们想减去17美元。 2美元是相同的:输入

targetID,cpd_number,Cell_assay_id,Cell_alt_assay_id,Cell_type_desc,Cell_Operator,Cell_result_value,Cell_unit_value,assay_id,alt_assay_id,type_desc,operator,result_value,unit_value,Ratio_operator,Ratio,log_ratio,Cell_experiment_date,experiment_date,Cell_discipline,discipline
 111,CPD-123456,2222,1111,IC50,,6.1,uM,1183,1265,Ki,,0.16,uM,,38.125,1.7511,2003-03-03 00:00:00,2003-02-10 00:00:00,Cell,Enzyme
 111,CPD-123456,2222,1111,IC50,,9.02053,uM,1183,1265,Ki,,0.16,uM,,56.3783,-1.5812,2003-02-27 00:00:00,2003-02-10 00:00:00,Cell,Enzyme
 111,CPD-777888,3333,4444,IC50,,6.1,uM,1183,1265,Ki,,0.16,uM,,38.125,-1,2003-03-03 00:00:00,2003-02-10 00:00:00,Cell,Enzyme
 111,CPD-777888,3333,4444,IC50,,9.02053,uM,1183,1265,Ki,,0.16,uM,,56.3783,-3,2003-02-27 00:00:00,2003-02-10 00:00:00,Cell,Enzyme

所需的输出应为(1.7511 - ( - 1.5812)= 3.3323); (-1 - ( - 3)= 2)

3.3323
2

首次尝试:

awk -F, ' last != $1""$2 && last{     # ONLY When last key  "TargetID + Cpd_number"
      print C                 # differs from actual , print line + substraction
      C=0}                    # reset acumulators
  { # This block process each line of infile
      C -= $17                # C calc
   line=$0                    # Line will be actual line without activity
   last=$1""$2}               # Store the key in orther to track switching
   END{ # This block triggers after the complete file read
       # to print the last average that cannot be trigger during
       # the previous block
      print C}' input

它会输出:

-0.1699
 4

第二次尝试:

#!/bin/bash

tail -n+2 test > test2   # remove the title/header

awk -F, '$1 == $1 && $2 == $2 {print $17}' test2 >> test3  # print $17 if the $1 and $2 are the same

awk 'NR==1{s=$1;next}{s-=$1}END{print s}' test3

rm test2 test3

test3将是

 1.7511
-1.5812
-1
-3

输出

7.3323

任何一位大师都能发表一些评论吗?谢谢!

2 个答案:

答案 0 :(得分:1)

awk '
  BEGIN { FS = "," }

  NR == 1 { next }          # skip header line

  {                         # accumulate totals
    if ($1 SUBSEP $2 in a)    # if key already exists
      a[$1,$2] -= $17           # subtract $17 from value
    else                      # if first appearance of this key
      a[$1,$2] = $17            # set value to $17
  }

  END {                     # print results
    for (x in a)
      print a[x]
  }
' file

答案 1 :(得分:1)

你可以尝试下面的awk命令,

$ awk -F, 'NR==1{next} {var=$1; foo=$2; bar=$17; getline;} $1==var && $2==foo{xxx=bar-$17; print xxx}' file
3.3323
2