Question

I am trying to merge the data from matching lines into a single line.

12345,this,is,one,line,1
13567,this,is,another,line,3
14689,and,this,is,another,6
12345,this,is,one,line,4
14689,and,this,is,another,10

Output

12345,this,is,one,line,1,4
13567,this,is,another,line,3
14689,and,this,is,another,6,10

Thanks.

Answer 1

awk -F',' '{if($1 in a) {a[$1]=a[$1] "," $NF} else {a[$1]=$0}} END {n=asort(a); for(i=1;i<=n;i++) print a[i]}' < input.txt

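This works for the given example.

Here is a commented file version of the same awk script, parse.awk. Keep in mind that this version uses only the first field as the key that identifies lines to merge. I will rewrite it according to the author's comment above (all fields but the last one).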

#!/usr/bin/awk -f

BEGIN {   # the BEGIN section runs once, before any input is read
    FS=","   # input field separator is a comma (can also be set with -F on the command line)
}

{   # the main section runs for every input line
    if($1 in a) {   # check whether array 'a' already contains an element indexed by the first field
        a[$1]=a[$1] "," $NF   # if the entry already exists, just append the last field of the current row
    }
    else {   # this line introduces a new entry
        a[$1]=$0   # add the whole line as a new array element
    }
}

END {   # the END section runs once, after the last line
    n=asort(a)   # sort array 'a' by its values (gawk extension); the indices become 1..n
    for(i=1;i<=n;++i) print a[i]   # walk the sorted array in index order and print its contents
}

Use it like this:

./parse.awk input.txt
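
Note that asort() is a GNU awk extension, so the script above needs gawk. Below is a minimal sketch of a variant that should run on any POSIX awk. It assumes it is acceptable to print the merged lines in the order their keys first appear, rather than sorted. The temporary file is only a stand-in for input.txt, and the names `order` and `merged` are made up for illustration.

```shell
# Sample input from the question, written to a temporary file (stand-in for input.txt).
input=$(mktemp)
cat > "$input" <<'EOF'
12345,this,is,one,line,1
13567,this,is,another,line,3
14689,and,this,is,another,6
12345,this,is,one,line,4
14689,and,this,is,another,10
EOF

# Merge on the first field, remembering the order in which keys first appear.
merged=$(awk -F',' '
    !($1 in a) { order[++n] = $1; a[$1] = $0; next }   # first time this key is seen
    { a[$1] = a[$1] "," $NF }                          # repeated key: append the last field
    END { for (i = 1; i <= n; i++) print a[order[i]] } # print in first-seen order
' "$input")

printf '%s\n' "$merged"
rm -f "$input"
```

For the sample data this prints the three merged lines shown in the question, because the keys already first appear in ascending order.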

Here is another version, which compares all fields of a line except the last one:


#!/usr/bin/awk -f

BEGIN {   # the BEGIN section runs once, before any input is read
    FS=","   # input field separator is a comma (can also be set with -F on the command line)
}

{   # the main section runs for every input line
    idx=""   # reset the index variable
    for(i=1;i<NF;++i) idx=idx $i FS   # join all but the last field, keeping the separator so "ab,c" and "a,bc" stay distinct
    if(idx in a) {   # check whether array 'a' already contains an element with this index
        a[idx]=a[idx] "," $NF   # if the entry already exists, just append the last field of the current row
    }
    else {   # this line introduces a new entry
        a[idx]=$0   # add the whole line as a new array element
    }
}

END {   # the END section runs once, after the last line
    n=asort(a)   # sort array 'a' by its values (gawk extension); the indices become 1..n
    for(i=1;i<=n;++i) print a[i]   # walk the sorted array in index order and print its contents
}
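
An alternative way to build the key, sketched here on the assumption that first-seen output order is acceptable: instead of looping over the fields, copy the line and strip its last field with sub(). The key then keeps its commas, so rows such as ab,c and a,bc cannot be confused with each other. The function name `merge_on_prefix` and the three test rows are made up for illustration.

```shell
# Hypothetical helper: merge lines that agree on every field except the last.
merge_on_prefix() {
    awk -F',' '
        {
            key = $0
            sub(/,[^,]*$/, "", key)        # drop the last field; commas inside the key are kept
            if (key in a) {
                a[key] = a[key] "," $NF    # known key: append the last field
            } else {
                a[key] = $0                # new key: store the whole line
                order[++n] = key           # and remember when it first appeared
            }
        }
        END { for (i = 1; i <= n; i++) print a[order[i]] }
    ' "$@"
}

# Rows 1 and 3 share the prefix "ab,c"; row 2 has the distinct prefix "a,bc".
result=$(printf '%s\n' 'ab,c,1' 'a,bc,2' 'ab,c,3' | merge_on_prefix)
printf '%s\n' "$result"
```

With no file arguments the function reads standard input; with arguments it reads the named files, e.g. `merge_on_prefix input.txt`.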

Feel free to ask for any further explanation.

Answer 2

This might work for you (GNU sed and sort):

sort -nt, -k1,1 -k6,6 file | 
sed ':a;$!N;s/^\(\([^,]*,\).*\)\n\2.*,/\1,/;ta;P;D'
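
The sed script is dense, so here is the same pipeline spread over several lines with a comment per command, run against the sample data from the question. The temporary file and the variable name `merged` are assumptions for illustration; the sort and sed commands themselves are unchanged.

```shell
# Sample input from the question, written to a temporary file (stand-in for "file").
input=$(mktemp)
cat > "$input" <<'EOF'
12345,this,is,one,line,1
13567,this,is,another,line,3
14689,and,this,is,another,6
12345,this,is,one,line,4
14689,and,this,is,another,10
EOF

# Sort numerically by field 1, then field 6, so lines with equal keys become adjacent.
merged=$(sort -nt, -k1,1 -k6,6 "$input" | sed '
:a
# append the next input line to the pattern space, unless this is the last line
$!N
# if the appended line starts with the same first field, keep only its last field
s/^\(\([^,]*,\).*\)\n\2.*,/\1,/
# a merge happened: loop back and try to absorb one more line
ta
# no merge: print the first line, delete it, and restart with what is left
P
D
')

printf '%s\n' "$merged"
rm -f "$input"
```

The sort step is what makes this work: sed only ever compares neighbouring lines, so matching keys must be next to each other before it runs.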

Merging some data from multiple lines
