Question

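I have a list of data with two columns. One column contains the IP address that sent the mail, and the other contains the total number of bytes sent in that mail. I want the cumulative total of all data transferred by each IP. Suppose there are four entries: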

192.168.0.100 40k
192.168.0.123 20k
192.168.0.100 15k
192.168.0.240 20k

Then the output should be:

192.168.0.100 55k
192.168.0.123 20k
192.168.0.240 20k

Answer 1

This should do it:

$ awk '{a[$1]+=$2} END { for (i in a) print i, a[i]"k"}' file
192.168.0.123 20k
192.168.0.100 55k
192.168.0.240 20k

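Explanation

{a[$1]+=$2} accumulates the values in the array a[], indexed by the first field of each line. In a numeric context awk reads "40k" as 40, so the suffix is dropped from the sum.
END { for (i in a) print i, a[i]"k"} loops over the array and prints each total. Note that the k has to be printed explicitly. The order in which for (i in a) visits the keys is unspecified, which is why the output above is not in input order.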
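
If a stable output order matters, one option is to pipe the result through sort. The following is a minimal sketch using the sample data from the question; the temporary file and the variable names input and result are only there for illustration.

```shell
# Write the sample data from the question to a temporary file (illustrative only).
input=$(mktemp)
printf '%s\n' \
  '192.168.0.100 40k' \
  '192.168.0.123 20k' \
  '192.168.0.100 15k' \
  '192.168.0.240 20k' > "$input"

# Sum column 2 per IP. awk treats "40k" as the number 40, so the k is re-added on output.
# LC_ALL=C sort gives a deterministic order, because "for (i in a)" order is unspecified.
result=$(awk '{a[$1]+=$2} END { for (i in a) print i, a[i]"k"}' "$input" | LC_ALL=C sort)
printf '%s\n' "$result"

rm -f "$input"
```

This only works as long as every value uses the same k suffix, since the suffix itself is never inspected.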

Answer 2

This is similar to the awk solution; }{ is a Perl shortcut for an END{} block:

perl -anE'$h{$F[0]} += $_ for /(\d+)k$/ }{say "$_ $h{$_}k" for sort keys %h' file

Answer 3

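I don't know how much mail each address is sending, but ignoring the suffix could cause problems. Here is one way to handle it using awk and numfmt, a relatively recent addition to GNU coreutils: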

# Lowercase k is a non-standard suffix and not supported by numfmt 
<file awk '$2=toupper($2)'  |

# We assume the k is IEC encoded, i.e. k=1024. Use --from=si if 1000 was intended
numfmt --field=2 --from=iec |

# Perform the summation, same as in @fedorqui's answer
awk '{ h[$1]+=$2 } END { for(k in h) print k, h[k] }' |

# Add appropriate suffixes. Again change to --to=si if k=1000
numfmt --field=2 --to=iec

Output:

192.168.0.100   55K
192.168.0.123   20K
192.168.0.240   20K
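
To illustrate why the suffix matters, here is a small sketch with mixed units. The input lines are made up for this example and are not from the question. It assumes GNU numfmt is installed and that the suffixes are already uppercase and IEC (1K = 1024).

```shell
# Hypothetical input with mixed suffixes: 1M and 512K for the same address.
result=$(
  printf '%s\n' \
    '192.168.0.100 1M' \
    '192.168.0.100 512K' \
    '192.168.0.123 20K' |
  # Convert field 2 to plain byte counts (1M -> 1048576, 512K -> 524288).
  numfmt --field=2 --from=iec |
  # Sum the byte counts per address.
  awk '{ h[$1]+=$2 } END { for (k in h) print k, h[k] }' |
  # Fix the output order, since awk's for-in order is unspecified.
  LC_ALL=C sort |
  # Convert the totals back to human-readable IEC units.
  numfmt --field=2 --to=iec
)
printf '%s\n' "$result"
```

Here 1M plus 512K should come out as 1.5M for the first address, whereas summing the bare numbers and ignoring the suffixes would give 513. The exact column spacing may vary, because numfmt pads the converted field.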
