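I have the code below, which replaces column 4 of fileA based on the data in fileB, but the output does not preserve the original file's whitespace. Is there a way to do that?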
tr , " " <fileB | awk 'NR==FNR{a[$2]=$1;next} {$4=a[$4];print}' - fileA
fileA
xxx xxx xxx Z0002
fileB
3100,3000
W0002,Z0002
Output with the code above:
xxx xxx xxx W0002
Expected output:
xxx xxx xxx W0002
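For context, the spacing is lost because assigning to any field makes awk rebuild the record from its fields joined by OFS (a single space by default). A minimal sketch on a made-up line with three spaces between the fields (the input and the function name are illustrative, not taken from fileA):

```shell
# Hypothetical one-line input with three spaces between the two fields.
rebuild_demo() {
    printf 'a   b\n' | awk '{ $2 = "X"; print }'
}
rebuild_demo    # prints "a X": the three spaces collapse to one
```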
Answer 0 (score: 1)
This should do it:
awk 'FNR==NR {split($0,a,",");b[a[2]]=a[1];next} {n=split($0,d,/[^[:space:]]*/);if(b[$4])$4=b[$4];for(i=1;i<=n;i++) printf("%s%s",d[i],$i);print ""}' fileB fileA
It stores the whitespace in an array so that it can be reused later.
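As a rough illustration of that idea, the sketch below splits a made-up line on runs of non-whitespace, so that only the gaps end up in the array. The input, the function name and the array name `sep` are assumptions for the example, and `[^ \t]+` is used here as a conservative stand-in for the `[^[:space:]]` class in the answer.

```shell
# Hypothetical input line with irregular spacing.
show_separators() {
    printf '  a   b c\n' | awk '{
        # Splitting on runs of non-blank characters leaves only the gaps in sep[].
        n = split($0, sep, /[^ \t]+/)
        for (i = 1; i <= n; i++) printf "[%s]", sep[i]
        print ""
    }'
}
show_separators    # prints "[  ][   ][ ][]"
```

The last element is empty because nothing follows the final field; printing `sep[i]` followed by `$i` for each `i` therefore reproduces the line with its original spacing.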
Example:
cat fileA
xxx xxx xxx Z0002 not change this
xxx xxx Z0002 zzz
xxx Z000223213 xxx Z0002 xxx xxx xxx Z0002
cat fileB
3100,3000
W0002,Z0002
awk 'FNR==NR {split($0,a,",");b[a[2]]=a[1];next} {n=split($0,d,/[^[:space:]]*/);if(b[$4])$4=b[$4];for(i=1;i<=n;i++) printf("%s%s",d[i],$i);print ""}' fileB fileA
xxx xxx xxx W0002 not change this
xxx xxx Z0002 zzz
xxx Z000223213 xxx W0002 xxx xxx xxx Z0002
A more readable version, with an explanation of how it works:
awk '
FNR==NR {                          # For the first file, "fileB"
    split($0,a,",")                # Split the line into array "a" using "," as the separator
    b[a[2]]=a[1]                   # Store the data in array "b", indexed by the second column
    next                           # Skip to the next record
}
{                                  # Then, for the file "fileA"
    n=split($0,d,/[^[:space:]]*/)  # Split out the runs of whitespace and store them in array "d"
    if(b[$4])                      # If array "b" has data for field 4
        $4=b[$4]                   # Change field 4 to the data found in array "b"
    for(i=1;i<=n;i++)              # Loop through all the fields in the line
        printf("%s%s",d[i],$i)     # Print the original separator followed by the field
    print ""                       # Add a newline at the end
}
' fileB fileA                      # Read the files.
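Since the sample files shown here no longer carry their irregular spacing, the following self-contained sketch recreates a small test case and runs a variant of the script above. It is not the answer's exact code: the function name `replace_col4`, the array names and the temporary files are made up for the example, the separator pattern uses `+` with an explicit space/tab class, and the lookup uses `in` so that a replacement value of `0` is not skipped.

```shell
replace_col4() {
    # $1 = mapping file (new,old per line), $2 = data file
    awk '
    FNR == NR {                          # first file: build the old -> new map
        split($0, a, ",")
        map[a[2]] = a[1]
        next
    }
    {
        n = split($0, sep, /[^ \t]+/)    # sep[i] is the gap before field i
        if ($4 in map) $4 = map[$4]
        out = ""
        for (i = 1; i <= n; i++) out = out sep[i] $i
        print out
    }' "$1" "$2"
}

# Hypothetical test data with irregular spacing.
tmp=$(mktemp -d)
printf '3100,3000\nW0002,Z0002\n' > "$tmp/fileB"
printf 'xxx   xxx  xxx    Z0002\n' > "$tmp/fileA"
replace_col4 "$tmp/fileB" "$tmp/fileA"    # prints "xxx   xxx  xxx    W0002"
```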
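Answer 1 (score: 0)
Using gsub (regex substitution) with a pattern that has a space before the value and the end-of-line anchor $ after it solves the problem. Because of the $ anchor, this only applies when column 4 is the last field on the line.
Test file: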
$ cat fileA
xxx xxx xxx Z0002
xxx xxx Z0002 xxx
xxx xxx xxx Z0002YY
Command and result:
$ tr , " " <fileB | awk 'NR==FNR{a[$2]=$1;next} a[$4]=="" {print} a[$4]!=""{gsub(" "$4"$", " "a[$4], $0);print}' - fileA
xxx xxx xxx W0002
xxx xxx Z0002 xxx
xxx xxx xxx Z0002YY
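To check that this approach keeps irregular spacing, the same awk program can be run on made-up input. In the sketch below the function name `gsub_demo` and the file names `map` and `data` are assumptions; the mapping file is written with a space separator directly, which is what the `tr` step would produce.

```shell
gsub_demo() {
    # $1 = scratch directory for the hypothetical input files
    printf 'W0002 Z0002\n' > "$1/map"
    printf 'xxx   xxx  xxx    Z0002\nxxx xxx Z0002 xxx\n' > "$1/data"
    awk 'NR==FNR{a[$2]=$1;next} a[$4]=="" {print} a[$4]!=""{gsub(" "$4"$", " "a[$4], $0);print}' "$1/map" "$1/data"
}

dir=$(mktemp -d)
gsub_demo "$dir"
# prints:
# xxx   xxx  xxx    W0002
# xxx xxx Z0002 xxx
```

The first line keeps its spacing because gsub edits $0 in place instead of assigning to a field; the second line is untouched because its fourth field has no mapping.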
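Answer 2 (score: 0)
Long awk answer
This is a bit overkill for this question, but I think it may be useful to others.
It avoids problems with regex metacharacters and with the pattern appearing elsewhere on the line.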
awk 'FNR==NR {split($0,a,",");b[a[2]]=a[1];next}
{
while(match(substr($0,x+=(RSTART+RLENGTH-(x>1?1:0))),"[^[:space:]]+")){
E[++D]=(RSTART+x-(x>1?1:0))
F[D]=E[D]+RLENGTH
}
}
b[$4]~/./{$0=substr($0,0,E[4]-1) b[$4] substr($0,F[4])}
{x=1;D=0;delete E}1' FILEB FILEA
Input
FILEA
xxx Z0002 xxx Z0002 xxx xxx xxx Z0002
xxx Z0002 xxx dsasa xxx xxx xxx Z0002
FILEB
3100,3000
W0002,Z0002
Output
xxx Z0002 xxx W0002 xxx xxx xxx Z0002
xxx Z0002 xxx dsasa xxx xxx xxx Z0002
An explanation will be added later.
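As far as it can be read from the code, the idea is to find the character offsets of field 4 and splice the replacement into $0 with substr, so that no regex is ever built from the data. A simplified sketch of that idea follows. It is not the answer's code: the function name `splice_col4`, the variable names and the test data are assumptions, and fields are taken to be separated by spaces and tabs only.

```shell
splice_col4() {
    # $1 = mapping file (new,old per line), $2 = data file
    awk '
    FNR == NR { split($0, a, ","); map[a[2]] = a[1]; next }
    {
        rest = $0; pos = 0; start = 0; len = 0
        # Walk the line one token at a time, tracking absolute offsets.
        for (i = 1; i <= 4 && match(rest, /[^ \t]+/); i++) {
            start = pos + RSTART          # absolute start of token i
            len = RLENGTH
            pos = start + len - 1         # absolute end of token i
            rest = substr($0, pos + 1)
        }
        # i > 4 means a fourth token was found; splice only if it is mapped.
        if (i > 4 && ($4 in map))
            print substr($0, 1, start - 1) map[$4] substr($0, start + len)
        else
            print
    }' "$1" "$2"
}

# Hypothetical test data with irregular spacing.
work=$(mktemp -d)
printf '3100,3000\nW0002,Z0002\n' > "$work/fileB"
printf 'xxx Z0002  xxx   Z0002 xxx\nxxx Z0002 xxx dsasa\n' > "$work/fileA"
splice_col4 "$work/fileB" "$work/fileA"
# prints:
# xxx Z0002  xxx   W0002 xxx
# xxx Z0002 xxx dsasa
```

Only the fourth token is replaced, even though the same value also appears as the second token, and the surrounding whitespace is copied through unchanged.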