I have a CSV file with the gene name in the first column and the tissue in which it is expressed in the second column.
Example:
ABC1,Heart
ABC1,Brain
ABC1,Kidney
BRAC1,Heart
BRAC1,Lungs
RHO,Eye
RHO,Kidney
RPE65,Eye
Required output:
ABC1,Heart;Brain;Kidney
BRAC1,Heart;Lungs
RHO,Eye;Kidney
RPE65,Eye
Any delimiter is fine for separating the expression values.
Answer 0 (score: 3)
This one-liner does the "grouping":
awk -F, '{a[$1]=a[$1](a[$1]?";":"")$2}END{for(x in a)print x FS a[x]}' file
If you want the output sorted, pipe the result to sort, e.g. awk '...' file | sort
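The order in which `for(x in a)` visits keys is unspecified in awk, so the genes may come out in a different order than they appear in the file. If input order matters, one option is to record each gene the first time it is seen. The following is a minimal sketch under that assumption; the function name `group_genes` and the arrays `order` and `a` are illustrative names, not part of the original answer.

```shell
# group_genes: hypothetical helper; reads "gene,tissue" lines on stdin and
# prints one line per gene, in order of first appearance.
group_genes() {
    awk -F, '
        {
            if ($1 in a) {
                a[$1] = a[$1] ";" $2      # gene seen before: append the tissue
            } else {
                order[++n] = $1           # new gene: remember its position
                a[$1] = $2
            }
        }
        END {
            for (i = 1; i <= n; i++)      # walk genes in first-seen order
                print order[i] FS a[order[i]]
        }
    '
}

# The input does not need to be sorted or grouped.
printf '%s\n' ABC1,Heart ABC1,Brain BRAC1,Heart ABC1,Kidney | group_genes
# ABC1,Heart;Brain;Kidney
# BRAC1,Heart
```

Because it tests `$1 in a` rather than the truthiness of `a[$1]`, this variant also keeps the separator correct when a tissue value is empty or `0`.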
Answer 1 (score: 1)
Another awk. This one depends on the data being sorted:
$ awk -F, '{printf "%s",($1==p?";"$2:ors $0);p=$1;ors=ORS}END{print ""}' <(sort file)
Explained:
$ awk -F, '{                              # -F, sets the field separator to a comma
    printf "%s",($1==p?";"$2:ors $0)      # same $1 as before: append ";"$2, else start a new record
    p=$1                                  # remember $1 for the next line
    ors=ORS                               # set after the first line, so no leading newline is printed
}
END {
    print ""                              # terminate the last output line
}' <(sort file)
Output:
ABC1,Brain;Heart;Kidney
BRAC1,Heart;Lungs
RHO,Eye;Kidney
RPE65,Eye
Answer 2 (score: 1)
Another awk:
awk -F, ' { if($1==p) { printf(";%s",$2);next} printf("%s%s",NR==1? "" :"\n",$0);p=$1 } END { print "" } ' file
With the given input:
$ cat manoj.txt
ABC1,Heart
ABC1,Brain
ABC1,Kidney
BRAC1,Heart
BRAC1,Lungs
RHO,Eye
RHO,Kidney
RPE65,Eye
$ awk -F, ' { if($1==p) { printf(";%s",$2);next} printf("%s%s",NR==1? "" :"\n",$0);p=$1 } END { print "" } ' manoj.txt
ABC1,Heart;Brain;Kidney
BRAC1,Heart;Lungs
RHO,Eye;Kidney
RPE65,Eye
$
Answer 3 (score: 0)
awk -F, '{printf "%s",$1==l?";"$2:(FNR != 1)?RS $0:$0;l=$1}END{print ""}' file
Output:
ABC1,Heart;Brain;Kidney
BRAC1,Heart;Lungs
RHO,Eye;Kidney
RPE65,Eye
Note: this assumes sorted input, or at least input in which all lines for the same gene are adjacent.
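If the input is not already grouped, one way to meet that assumption without reordering the tissues within a gene is a stable sort on the first field only. This is a sketch, not part of the original answer: `group_sorted` and `prev` are illustrative names, and `sort -s` (stable sort) is available in GNU and BSD sort but is not required by POSIX.

```shell
# group_sorted: hypothetical helper; reads "gene,tissue" lines on stdin.
group_sorted() {
    # -t, -k1,1 sorts on the gene column only; -s keeps the original
    # order of lines that share a gene (assumes GNU or BSD sort).
    LC_ALL=C sort -s -t, -k1,1 |
    awk -F, '
        NR > 1 && $1 == prev { printf ";%s", $2; next }   # same gene: append tissue
        { printf "%s%s", (NR > 1 ? "\n" : ""), $0; prev = $1 }   # new gene: start a new line
        END { if (NR) print "" }                          # terminate the last line
    '
}

printf '%s\n' RHO,Eye ABC1,Heart RHO,Kidney ABC1,Brain | group_sorted
# ABC1,Heart;Brain
# RHO,Eye;Kidney
```

Sorting whole lines, as `sort file` does, also groups the genes but puts the tissues in alphabetical order; restricting the key to the first field avoids that.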