I have a file in the following format:
0.019059000 15150000000
0.037088000 15150000000
0.035007000 15150000001
0.047622000 15150000001
0.053359000 15150000002
0.060405000 15150000002
0.068598000 15150000003
0.081587000 15150000003
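When two rows have the same value in column 2, I want to subtract their column 1 values (second minus first). For the input file above, I would like to get the following: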
0.018029 15150000000
0.012615 15150000001
0.007046 15150000002
0.012989 15150000003
In the input file, every value in column 2 occurs as a pair: 15150000000 appears exactly twice, 15150000001 appears exactly twice, and so on.
Any help is welcome!
Answer 0: (score: 4)
awk to the rescue! (No error checking.)
$ awk 'p==$2 {print $1-pv,p} {p=$2; pv=$1}' file
0.018029 15150000000
0.012615 15150000001
0.007046 15150000002
0.012989 15150000003
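Since the one-liner above does no error checking, here is a minimal sketch of what a checked variant might look like for input where the two records of a pair are adjacent. The function name `pair_diff_checked` and the warning texts are illustrative assumptions, not part of the original answer, and writing to "/dev/stderr" is assumed to work in the awk in use (it does in gawk and mawk).

```shell
# Hypothetical helper (name and messages are illustrative only).
# Prints "second - first" for each adjacent pair sharing column 2,
# and warns on stderr about malformed lines or keys without a partner.
pair_diff_checked() {
    awk '
        NF != 2           { printf "line %d: expected 2 fields\n", NR > "/dev/stderr"; next }
        have && $2 == key { print $1 - val, key; have = 0; next }   # second record of a pair
        have              { printf "key %s has no partner\n", key > "/dev/stderr" }
                          { key = $2; val = $1; have = 1 }          # first record of a pair
        END               { if (have) printf "key %s has no partner\n", key > "/dev/stderr" }
    ' "$@"
}
```

Called as `pair_diff_checked file`, it should print the same lines as the one-liner when every key occurs exactly twice in a row. Unlike the original, it consumes records strictly in pairs, so a key that appears three times in a row yields one difference plus a warning instead of two differences.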
For unsorted input, still with exactly two records per key:
$ awk '$2 in a {print $1-a[$2],$2; delete a[$2]; next} {a[$2]=$1}' file
0.018029 15150000000
0.012615 15150000001
0.007046 15150000002
0.012989 15150000003
If the second value is not always greater than the first and you want the absolute difference:
$ awk 'function abs(x) {return x<0?-x:x}
$2 in a {print abs($1-a[$2]),$2; delete a[$2]; next}
{a[$2]=$1}' file
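The answer shows no output for the absolute-difference variant, so here is a small sketch that runs the same awk program on four of the question's rows, interleaved and with the larger value first. The wrapper name `abs_pair_diff` and the temporary sample file are assumptions made for the demonstration.

```shell
# Hypothetical wrapper around the awk program from the answer above.
abs_pair_diff() {
    awk 'function abs(x) { return x < 0 ? -x : x }
         $2 in a { print abs($1 - a[$2]), $2; delete a[$2]; next }   # partner seen: print |difference|
                 { a[$2] = $1 }                                      # first sighting: remember column 1
    ' "$@"
}

# Sample rows from the question, interleaved and with the larger value first.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0.037088000 15150000000
0.047622000 15150000001
0.019059000 15150000000
0.035007000 15150000001
EOF

abs_pair_diff "$sample"
# prints:
# 0.018029 15150000000
# 0.012615 15150000001
rm -f "$sample"
```

Without `abs`, both lines would come out negative here, because the second record of each pair holds the smaller value.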
Answer 1: (score: 1)
Another one in awk, subtracting the smaller value from the bigger one:
$ awk '{
    if ($2 in a) {                        # the partner of this $2 was already seen
        print ((s=$1-a[$2])>0?s:-s), $2   # subtract smaller from bigger
        delete a[$2]                      # delete to save memory
    } else
        a[$2] = $1                        # otherwise store $1 under key $2
}' <(shuf file)                           # shuf the file to demo random order;
                                          # in real use, replace with just the file name
Sample output (in random order because of shuf):
0.007046 15150000002
0.018029 15150000000
0.012615 15150000001
0.012989 15150000003
Answer 2: (score: 0)
How about
awk '{a[$2] = $1 - a[$2]} END {for (b in a) print a[b], b}' file
Hmm, I see that your values come in pairs. In that case, accept karakfa's answer.
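For reference, the one-liner in this answer relies on awk treating an unset array element as 0 in arithmetic: the first record for a key stores `$1 - 0`, and the second replaces it with `$1` minus that stored value. It therefore only gives a pair difference when each key occurs exactly twice, and the `for (b in a)` loop visits keys in an unspecified order. A minimal sketch that makes the order deterministic by sorting numerically on column 2 could look like this; the function name `pair_diff_sorted` is an assumption for illustration.

```shell
# Hypothetical wrapper: same accumulator trick, with output sorted by key.
pair_diff_sorted() {
    awk '{ a[$2] = $1 - a[$2] }                  # 1st record: $1 - 0; 2nd record: $1 - first
         END { for (k in a) print a[k], k }      # iteration order is unspecified in awk
    ' "$@" | sort -k2,2n                         # so sort numerically on column 2
}
```

Usage would be `pair_diff_sorted file`. Because the whole array is held until `END`, memory grows with the number of distinct keys, unlike the variants that `delete` each key once its pair is complete.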