Using two variables with sed to replace values

Time: 2018-07-27 07:09:15

Tags: bash awk sed

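First of all, I have to say that I tried the other solutions posted here, but none of them solved my problem. Sorry if this is similar to another question, but I could not get it to work. I have a file like this: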

chr17   7579366 COSM45509;COSM11448;COSM45040   G   A,C,T   13.2    PASS    AF=0.0216216,0,0;AO=4,0,0;DP=185;FAO=4,0,0;FDP=185;FDVR=5,5,5;FR=.,.,.,REALIGNEDx0.03243;FRO=181;FSAF=3,0,0;FSAR=1,0,0;FSRF=136;FSRR=45;FUNC=[{'origPos':'7579366','origRef':'G','normalizedRef':'G','gene':'TP53','normalizedPos':'7579366','normalizedAlt':'A','gt':'pos','codon':'TAT','coding':'c.321C>T','transcript':'NM_000546.5','function':'synonymous','protein':'p.(=)','location':'exonic','origAlt':'A','exon':'4','CLNACC1':'RCV000220860','CLNSIG1':'Likely_benign','CLNREVSTAT1':'single','CLNID1':'rs770776262'}];FWDB=0.0168253,-0.0508378,0.0146373;FXX=0;HRUN=1,1,1;HS;HS_ONLY=0;LEN=1,1,1;MLLD=80.6954,128.354,137.413;OALT=A,C,T;OID=COSM45509,COSM11448,COSM45040;OMAPALT=A,C,T;OPOS=7579366,7579366,7579366;OREF=G,G,G;PB=.,.,.;PBP=.,.,.;QD=0.285514;RBI=0.0283383,0.0582477,0.0324614;REFB=0.000208231,-0.000390718,-0.000240087;REVB=0.0228028,-0.0284309,-0.028974;RO=181;SAF=3,0,0;SAR=1,0,0;SRF=136;SRR=45;SSEN=0,0,0;SSEP=0,0,0;SSSB=-0.00109306,0,0;STB=0.501796,0.5,0.5;STBP=0.98,1,1;TYPE=snp,snp,snp;VARB=-0.00120684,0,0;AF_gnomAD=0;cosmic_ids=COSM2745025,COSM5055813,COSM2745026,COSM4272039,COSM5055814,COSM5055812,COSM11448,COSM45509,COSM2745027,COSM45040,COSM4487692,COSM4487691,COSM213589,COSM5055811,COSM213590 GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR   0/1:13:185:185:181:181:4:4:0.0216215997934341:1:3:136:45:1:3:136:45
chr18   48575168    COSM14216;COSM25274 T   TA  41.6    PASS    AF=0.0178891;AO=92;DP=3963;FAO=70;FDP=3913;FDVR=0;FR=.;FRO=3843;FSAF=43;FSAR=27;FSRF=2160;FSRR=1683;FUNC=[{'origPos':'48575168','origRef':'T','normalizedRef':'T','gene':'SMAD4','normalizedPos':'48575168','normalizedAlt':'TA','gt':'pos','codon':'ATG','coding':'c.366_367insA','transcript':'NM_005359.5','function':'frameshiftInsertion','protein':'p.Cys123fs','location':'exonic','origAlt':'TA','exon':'3'}];FWDB=0.0820535;FXX=0.0098684;HRUN=4;HS;HS_ONLY=0;LEN=1;MLLD=27.4213;OALT=A,A;OID=COSM14216,COSM25274;OMAPALT=TA,TA;OPOS=48575170,48575173;OREF=-,-;PB=.;PBP=.;QD=0.0425701;RBI=0.0857718;REFB=-0.00250641;REVB=0.0249804;RO=3838;SAF=50;SAR=42;SRF=2167;SRR=1671;SSEN=0;SSEP=0;SSSB=-0.0142007;STB=0.552796;STBP=0.415;TYPE=ins;VARB=0.109082;AF_gnomAD=0 GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR   0/1:41:3963:3913:3838:3843:92:70:0.0178890991955996:42:50:2167:1671:27:43:2160:1683

chr18   48604749    rs377767375;COSM6784692 G   T   617.0   PASS    AF=0.107822;AO=153;DP=1423;FAO=153;FDP=1419;FDVR=5;FR=.,REALIGNEDx0.1326;FRO=1266;FSAF=143;FSAR=10;FSRF=1076;FSRR=190;FUNC=[{'origPos':'48604749','origRef':'G','normalizedRef':'G','gene':'SMAD4','normalizedPos':'48604749','normalizedAlt':'T','polyphen':'0.834','gt':'pos','codon':'TTG','coding':'c.1571G>T','sift':'0.0','grantham':'61.0','transcript':'NM_005359.5','function':'missense','protein':'p.Trp524Leu','location':'exonic','origAlt':'T','exon':'12','CLNACC1':'RCV000021747','CLNSIG1':'Pathogenic','CLNREVSTAT1':'no_criteria','CLNID1':'rs377767375'}];FWDB=0.0098166;FXX=0.0042105;HRUN=2;HS_ONLY=0;LEN=1;MLLD=57.8823;OALT=T;OID=.;OMAPALT=T;OPOS=48604749;OREF=G;PB=.;PBP=.;QD=1.73921;RBI=0.0271184;REFB=-0.00145864;REVB=-0.0252793;RO=1267;SAF=143;SAR=10;SRF=1078;SRR=189;SSEN=0;SSEP=0;SSSB=0.257732;STB=0.701118;STBP=0.001;TYPE=snp;VARB=0.0118254;AF_gnomAD=0;rs_ids=rs377767375;cosmic_ids=COSM6784692   GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR   0/1:616:1423:1419:1267:1266:153:153:0.107822000980377:10:143:1078:189:10:143:1076:190

I have two variables that I want to use:

A=$(cat file | awk -F '\t' '{print$8}' | awk -F ';' '{print$1}' | awk -F '=' '{print$2}')
B=$(cat file | grep -v "#" | awk -F '\t' '{print$10}' | awk -F ":" '{print$9}')

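My goal is to replace each value in $A with the corresponding value in $B. I tried running sed -i directly, but it throws an error related to the whitespace inside the variable: sed: -e expression: unterminated `s' command. I also tried a for loop, but could not get rid of the error. My last attempt was this: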

length=3
for ((i=0;i<=$length;i++)); do sed -i "s/"${key[$i]}"/"${value[$i]}"/g" file ; done

Any ideas? Thanks!

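Edit: the output should replace the value after AF= with the value from $B, which is the 9th value in the last ":"-separated block of numbers. For this example: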

chr17   7579366 COSM45509;COSM11448;COSM45040   G   A,C,T   13.2    PASS    AF=0.0216215997934341;AO=4,0,0;DP=185;FAO=4,0,0;FDP=185;FDVR=5,5,5;FR=.,.,.,REALIGNEDx0.03243;FRO=181;FSAF=3,0,0;FSAR=1,0,0;FSRF=136;FSRR=45;FUNC=[{'origPos':'7579366','origRef':'G','normalizedRef':'G','gene':'TP53','normalizedPos':'7579366','normalizedAlt':'A','gt':'pos','codon':'TAT','coding':'c.321C>T','transcript':'NM_000546.5','function':'synonymous','protein':'p.(=)','location':'exonic','origAlt':'A','exon':'4','CLNACC1':'RCV000220860','CLNSIG1':'Likely_benign','CLNREVSTAT1':'single','CLNID1':'rs770776262'}];FWDB=0.0168253,-0.0508378,0.0146373;FXX=0;HRUN=1,1,1;HS;HS_ONLY=0;LEN=1,1,1;MLLD=80.6954,128.354,137.413;OALT=A,C,T;OID=COSM45509,COSM11448,COSM45040;OMAPALT=A,C,T;OPOS=7579366,7579366,7579366;OREF=G,G,G;PB=.,.,.;PBP=.,.,.;QD=0.285514;RBI=0.0283383,0.0582477,0.0324614;REFB=0.000208231,-0.000390718,-0.000240087;REVB=0.0228028,-0.0284309,-0.028974;RO=181;SAF=3,0,0;SAR=1,0,0;SRF=136;SRR=45;SSEN=0,0,0;SSEP=0,0,0;SSSB=-0.00109306,0,0;STB=0.501796,0.5,0.5;STBP=0.98,1,1;TYPE=snp,snp,snp;VARB=-0.00120684,0,0;AF_gnomAD=0;cosmic_ids=COSM2745025,COSM5055813,COSM2745026,COSM4272039,COSM5055814,COSM5055812,COSM11448,COSM45509,COSM2745027,COSM45040,COSM4487692,COSM4487691,COSM213589,COSM5055811,COSM213590    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR   0/1:13:185:185:181:181:4:4:0.0216215997934341:1:3:136:45:1:3:136:45
chr18   48575168    COSM14216;COSM25274 T   TA  41.6    PASS    AF=0.0178890991955996;AO=92;DP=3963;FAO=70;FDP=3913;FDVR=0;FR=.;FRO=3843;FSAF=43;FSAR=27;FSRF=2160;FSRR=1683;FUNC=[{'origPos':'48575168','origRef':'T','normalizedRef':'T','gene':'SMAD4','normalizedPos':'48575168','normalizedAlt':'TA','gt':'pos','codon':'ATG','coding':'c.366_367insA','transcript':'NM_005359.5','function':'frameshiftInsertion','protein':'p.Cys123fs','location':'exonic','origAlt':'TA','exon':'3'}];FWDB=0.0820535;FXX=0.0098684;HRUN=4;HS;HS_ONLY=0;LEN=1;MLLD=27.4213;OALT=A,A;OID=COSM14216,COSM25274;OMAPALT=TA,TA;OPOS=48575170,48575173;OREF=-,-;PB=.;PBP=.;QD=0.0425701;RBI=0.0857718;REFB=-0.00250641;REVB=0.0249804;RO=3838;SAF=50;SAR=42;SRF=2167;SRR=1671;SSEN=0;SSEP=0;SSSB=-0.0142007;STB=0.552796;STBP=0.415;TYPE=ins;VARB=0.109082;AF_gnomAD=0    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR   0/1:41:3963:3913:3838:3843:92:70:0.0178890991955996:42:50:2167:1671:27:43:2160:1683

chr18   48604749    rs377767375;COSM6784692 G   T   617.0   PASS    AF=0.107822000980377;AO=153;DP=1423;FAO=153;FDP=1419;FDVR=5;FR=.,REALIGNEDx0.1326;FRO=1266;FSAF=143;FSAR=10;FSRF=1076;FSRR=190;FUNC=[{'origPos':'48604749','origRef':'G','normalizedRef':'G','gene':'SMAD4','normalizedPos':'48604749','normalizedAlt':'T','polyphen':'0.834','gt':'pos','codon':'TTG','coding':'c.1571G>T','sift':'0.0','grantham':'61.0','transcript':'NM_005359.5','function':'missense','protein':'p.Trp524Leu','location':'exonic','origAlt':'T','exon':'12','CLNACC1':'RCV000021747','CLNSIG1':'Pathogenic','CLNREVSTAT1':'no_criteria','CLNID1':'rs377767375'}];FWDB=0.0098166;FXX=0.0042105;HRUN=2;HS_ONLY=0;LEN=1;MLLD=57.8823;OALT=T;OID=.;OMAPALT=T;OPOS=48604749;OREF=G;PB=.;PBP=.;QD=1.73921;RBI=0.0271184;REFB=-0.00145864;REVB=-0.0252793;RO=1267;SAF=143;SAR=10;SRF=1078;SRR=189;SSEN=0;SSEP=0;SSSB=0.257732;STB=0.701118;STBP=0.001;TYPE=snp;VARB=0.0118254;AF_gnomAD=0;rs_ids=rs377767375;cosmic_ids=COSM6784692  GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR   0/1:616:1423:1419:1267:1266:153:153:0.107822000980377:10:143:1078:189:10:143:1076:190

1 answer:

Answer 0 (score: 2)

You can do this directly with awk, without a for loop or any other extra commands:

awk -F'[:| ]' '{a=$(NF-8); sub(/=[^;]*/,"="); sub(/^[^=]*=/,"&"a)}1' file

Brief explanation:

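  • -F'[:| ]': sets ":", "|" and the space character as field separators.
  • a=$(NF-8): takes the 9th field counting from the end of the line, which is the value to substitute, and stores it in 'a'.
  • sub(/=[^;]*/,"="): removes the value between the first '=' and the ';' that follows it.
  • sub(/^[^=]*=/,"&"a): appends the value of 'a' right after that first '='.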
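
The one-liner edits the first '=' on every line, so it would also touch "#" header lines that contain an '='. As a minimal sketch of a stricter variant, assuming the file is tab-separated (as the question's own -F '\t' suggests) and that AF= is always the first entry of column 8, the same idea can be written per column. The function name fix_af and the shortened sample.vcf line are made up for illustration and are not part of the original answer.

```shell
# Hypothetical helper: copy the 9th ":"-separated value of column 10
# into the leading AF= entry of column 8, leaving everything else as is.
fix_af() {
    awk 'BEGIN { FS = OFS = "\t" }
    /^#/ { print; next }                    # pass header lines through untouched
    NF >= 10 {
        split($10, s, ":")                  # s[9] holds the 9th sample value
        sub(/^AF=[^;]*/, "AF=" s[9], $8)    # rewrite only the AF= entry in column 8
    }
    { print }' "$1"
}

# Shortened sample line in the same layout as the file in the question.
printf 'chr17\t7579366\tCOSM45509\tG\tA,C,T\t13.2\tPASS\tAF=0.0216216,0,0;AO=4,0,0;DP=185\tGT:GQ:DP:FDP:RO:FRO:AO:FAO:AF\t0/1:13:185:185:181:181:4:4:0.0216215997934341\n' > sample.vcf

fix_af sample.vcf
```

For the sample line, column 8 becomes AF=0.0216215997934341;AO=4,0,0;DP=185. Like the one-liner, this prints to standard output. To change the file itself, redirect the output to a temporary file and move it over the original.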