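Using awk to compute the difference between two columns

Time: 2017-06-13 20:21:19

Tags: bash awk

I want to compute the difference between some columns in a CSV file, for example:

File: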

item1,0.01,0.1
item2,0.02,0.2
item3,0.03,0.3

Expected output file:

item1,0.01,0.1,-0.09
item2,0.02,0.2,-0.18
item3,0.03,0.3,-0.27

I tried something like this:

awk -F, '{print $2-$3 "," $0}'

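This puts the difference in the first column, but I could not get it into the fourth column! The following did not work and gave me strange results such as ',$0 [original line]':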

awk -F, '{print $0 "," $2-$3}'

What is going on here, and how can I fix it? I am using GNU awk under bash.

I also tried the hint from "subtract the values of two columns using awk or bash", for example:

awk '{ $4 = $2 - $3 } 1'

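but that did not give the expected result either. Also, what does the '1' at the end of the command do?

Update: I think there is an error in my real data file: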

fa8befbbf03c5539363996a576d5df20,0.725571036339,0.654274122734
fb51f93cc69b6be7375f518092330197,0.941242694855,0.888087145568
fc35b866ed1b3176193ccab251394cf2,0.0169462561607,0.10700264598
fd43d08452687499c00dc62511e5fb8c,0.13467258215,0.197959610293
fe4e8d77fa1770a331b3fca0f712d1a2,0.732236325741,0.302812807639
ff339fd5b4bfc7e916591ecc88286584,0.0581884384155,0.276936129794
ff34734e135192a75838d18e870bec86,0.941790342331,0.680042603973
ff34be2a8759cadcae3ea0fc74d7ef7e,0.111128211021,0.0429052298147
ff910f590b8d19dbc135d69a4bb6dc3e,0.400317430496,0.623828952199
ff9be3a6286f90d0b3ce7b049ac1cb9a,0.0130054950714,0.0511833470525

This seems to break both of the solutions proposed below.

$ awk -F, -v OFS="," '{$4=$2-$3}1' file2.csv 
,0.07129693c5539363996a576d5df20,0.725571036339,0.654274122734
,0.05315559b6be7375f518092330197,0.941242694855,0.888087145568
,-0.0900564b3176193ccab251394cf2,0.0169462561607,0.10700264598
,-0.063287687499c00dc62511e5fb8c,0.13467258215,0.197959610293
,0.429424a1770a331b3fca0f712d1a2,0.732236325741,0.302812807639
,-0.218748bfc7e916591ecc88286584,0.0581884384155,0.276936129794
,0.26174835192a75838d18e870bec86,0.941790342331,0.680042603973
,0.068223759cadcae3ea0fc74d7ef7e,0.111128211021,0.0429052298147
,-0.2235128d19dbc135d69a4bb6dc3e,0.400317430496,0.623828952199
,-0.0381779f90d0b3ce7b049ac1cb9a,0.0130054950714,0.0511833470525

Both of them work fine on the sample data file.

Thanks!

2 answers:

Answer 0 (score: 3)

$ awk -F, -v OFS="," '{$4=$2-$3}1' file1
# Or awk -F, -v OFS="," '{$(NF+1)=$2-$3}1' file1
item1,0.01,0.1,-0.09
item2,0.02,0.2,-0.18
item3,0.03,0.3,-0.27

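$NF is the last field (NF itself is the number of fields).
$(NF+1) is one extra field after the last field.
OFS is the output field separator.
-F (or the FS variable) sets the input field separator.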
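To make the role of -F concrete, here is a minimal sketch that runs the same assignment with and without the field separator. The variable names (row, with_sep, without_sep) are illustrative only. Without -F, awk splits on whitespace, so the whole line is a single field; $2 and $3 are empty and evaluate to 0, which is probably why the `awk '{ $4 = $2 - $3 } 1'` attempt in the question did not give the expected result.

```shell
# One record from the question's sample data.
row='item1,0.01,0.1'

# With -F, and OFS=, the row splits into 3 fields; $(NF+1) appends a 4th.
with_sep=$(printf '%s\n' "$row" | awk -F, -v OFS=, '{ $(NF+1) = $2 - $3 } 1')

# Without -F the whole row is a single field, so $2 and $3 are empty (0 - 0),
# and the rebuilt record is joined with the default OFS, a single space.
without_sep=$(printf '%s\n' "$row" | awk '{ $4 = $2 - $3 } 1')

printf '%s\n' "$with_sep" "$without_sep"
```

The first line printed is the comma-separated row with the difference appended; the second is the untouched row followed by three spaces and a 0.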

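What the 1 does:
An awk program always follows the rule condition{actions}. You may omit the condition or the {actions} block, but not both.

If the condition is omitted, it is assumed to be true (1 = true), so {actions} runs for every record.

If {actions} is omitted, the default action is executed: {print $0}

So a lone 1 is simply an always-true condition that triggers the default action, {print $0}, which prints $0 as modified so far.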
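As a quick check of the rule above, the following sketch writes the same program three ways: pattern only, action only, and both spelled out. The variable names are made up for illustration. All three should print the same line.

```shell
row='item1,0.01,0.1'

# Pattern only: 1 is always true, so the default action { print $0 } runs.
pattern_only=$(printf '%s\n' "$row" | awk -F, -v OFS=, '{ $4 = $2 - $3 } 1')

# Action only: no pattern, so the action runs for every record.
action_only=$(printf '%s\n' "$row" | awk -F, -v OFS=, '{ $4 = $2 - $3 } { print $0 }')

# Pattern and action both written out explicitly.
explicit=$(printf '%s\n' "$row" | awk -F, -v OFS=, '{ $4 = $2 - $3 } 1 { print $0 }')

printf '%s\n' "$pattern_only" "$action_only" "$explicit"
```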

Answer 1 (score: 2)

$ cat items.csv 
item1,0.01,0.1
item2,0.02,0.2
item3,0.03,0.3

$ awk -F, -v OFS=',' '{ print $0,$2-$3 }' items.csv 
item1,0.01,0.1,-0.09
item2,0.02,0.2,-0.18
item3,0.03,0.3,-0.27
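A note on the real data in the question's update: the output there, where the difference overwrites the start of each line, is consistent with Windows-style CRLF line endings. The trailing carriage return becomes part of $3, and when the terminal prints it, the cursor jumps back to column 1 before the new field is written. This is an inference from the output shown, not something confirmed in the thread. Assuming that is the cause, a minimal sketch of a fix is to strip the carriage return before touching the fields:

```shell
# Simulate one CRLF-terminated record from the question's real data
# (assumption: the actual file ends each line with \r\n).
line='fa8befbbf03c5539363996a576d5df20,0.725571036339,0.654274122734'

# sub() removes the trailing \r from $0, which re-splits the fields,
# so the appended difference lands at the true end of the line.
fixed=$(printf '%s\r\n' "$line" | awk -F, -v OFS=, '{ sub(/\r$/, ""); $4 = $2 - $3 } 1')

printf '%s\n' "$fixed"
```

Converting the file once with a tool such as dos2unix, if it is available, should have the same effect. `file file2.csv` or `od -c file2.csv | head` can help confirm whether carriage returns are actually present.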