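I have two fixed-length files, input #1 and input #2. I want to match lines based on the value at positions 37-50 in both files (positions 37-50 hold the same value in both files).
If a matching record is found, I need to cut the company code and invoice number from input file #1 (position 99 to the end of the line).
The cut string (from input #1) needs to be appended to the end of the matching record/line.
Below are the code I tried (not working), the input files, and the desired output. Please offer your suggestions.
Code: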
awk '
NR==FNR && NF>1 {
v=substr($0,37,14);
#print substr($0,37,14)
next
}
NR==FNR && ( /Company Code/ OR /Invoice Number/ ) {
sub(/Company Code/,"",$0);
sub(/Invoice Number/,"",$0);
a[v]=$0;
print $0
next
}
(substr($0,37,14) in a) {
print $0 a[substr($0,99)]
}' Input1.txt input2.txt input3.txt
End of code
Input #1 begins (each line starts with some leading spaces)
612 1111111111201402120000 2 1 111 211 Due Date 20140101
612 1111111111201402120000 2 1 111 311 Company Code 227
612 1111111111201402120000 2 1 111 411 Item Code 12
612 1111111111201402120000 2 1 111 511 Invoice Number 2014010
612 1111111111201402120000 2 2 111 611 Company Code 214
612 1111111111201402120000 2 2 111 711 Item Code 20
612 1111111111201402120000 2 2 111 811 Invoice Number 3014010
612 1111111111201402120000 2 3 111 911 Due Date 20140101
612 1111111111201402120000 2 3 111 111 Invoice Number 40140101
612 1111111111201402120000 2 3 111 121 user code 15563263636
612 1111111111201402120000 2 3 111 131 Amount Due 100000
612 111111111120140212000078978982123444 111 141 Due Date 20140101
612 111111111120140212000078978982123444 111 151 Invoice Number 50140101
612 111111111120140212000078978982123444 111 161 Amount Due 008000
Input #1 ends
Input #2 begins
510 77432201111010000 2 1 1ChK 100111000001 121000248 123456789 20111101.510.77432.20001C
510 77432201111010000 2 1 2INv 20111101.510.77432.20001D
510 77432201111010000 2 1 3INv 20111101.510.77432.20002D
510 77432201111010000 2 1 4INv 20111101.510.77432.20003D
510 77432201111010000 2 1 5INv 20111101.510.77432.20004D
510 77432201111010000 2 2 1ChK 200111000002 121000248 123456789 20111101.510.77432.20002C
510 77432201111010000 2 2 2INv 20111101.510.77432.20005D
510 77432201111010000 2 2 3INv 20111101.510.77432.20006D
510 77432201111010000 2 2 4INv 20111101.510.77432.20007D
510 77432201111010000 2 2 5INv 20111101.510.77432.20008D
510 77432201111010000 2 3 1ChK 300111000003 121000248 123456789 20111101.510.77432.20003C
510 77432201111010000 2 3 2INv 20111101.510.77432.20009D
510 77432201111010000 2 3 3INv 20111101.510.77432.20010D
510 77432201111010000 2 3 4INv 20111101.510.77432.20011D
510 77432201111010000 2 6 1ChK 600111000006 121000248 123456789 20111101.510.77432.20006C
510 77432201111010000 2 6 2INv 20111101.510.77432.20021D
510 77432201111010000 2 6 3INv 20111101.510.77432.20022D
510 77432201111010000 2 6 4INv 20111101.510.77432.20023D
510 77432201111010000 2 6 5INv 20111101.510.77432.20024D
Input #2 ends
Desired output
510 77432201111010000 2 1 1ChK 100111000001 121000248 123456789 20111101.510.77432.20001C 2272014010 (company & Inv # from input 1)
510 77432201111010000 2 1 2INv 20111101.510.77432.20001D 2272014010
510 77432201111010000 2 1 3INv 20111101.510.77432.20002D 2272014010
510 77432201111010000 2 1 4INv 20111101.510.77432.20003D (company & Inv # from input 1)
510 77432201111010000 2 1 5INv 20111101.510.77432.20004D (company & Inv # from input 1)
510 77432201111010000 2 2 1ChK 200111000002 121000248 123456789 20111101.510.77432.20002C (company & Inv # from input 1)
510 77432201111010000 2 2 2INv 20111101.510.77432.20005D (company & Inv # from input 1)
510 77432201111010000 2 2 3INv 20111101.510.77432.20006D (company & Inv # from input 1)
510 77432201111010000 2 2 4INv 20111101.510.77432.20007D (company & Inv # from input 1)
510 77432201111010000 2 2 5INv 20111101.510.77432.20008D (company & Inv # from input 1)
510 77432201111010000 2 3 1ChK 300111000003 121000248 123456789 20111101.510.77432.20003C (company & Inv # from input 1)
510 77432201111010000 2 6 1ChK 600111000006 121000248 123456789 20111101.510.77432.20006C <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 2INv 20111101.510.77432.20021D <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 3INv 20111101.510.77432.20022D <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 4INv 20111101.510.77432.20023D <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 5INv 20111101.510.77432.20024D <there is no matching record in input 1, this will be blank>
Answer 0 (score: 0)
Try something like this (untested):
awk '
NR==FNR && /Company Code/ {
cc[$3,$4] = $NF;
next;
}
NR==FNR && /Invoice Number/ {
inv[$3,$4] = $NF;
next;
}
NR==FNR {next}
{print $0 FS cc[$3,$4] inv[$3,$4]}' input1 input2
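As a rough check of the approach above, the following sketch runs the same rules against a few made-up, whitespace-separated lines. The sample data and the temporary file names are assumptions for illustration only; the real files are fixed-width, so splitting on whitespace may not line the fields up the same way there.

```shell
# Hypothetical sample data, much narrower than the real fixed-width files.
tmp=$(mktemp -d)
printf '%s\n' \
  '612 1111 2 1 111 311 Company Code 227' \
  '612 1111 2 1 111 411 Item Code 12' \
  '612 1111 2 1 111 511 Invoice Number 2014010' > "$tmp/input1"
printf '%s\n' \
  '510 7743 2 1 1ChK' \
  '510 7743 2 6 1ChK' > "$tmp/input2"

# First file: remember company code and invoice number per ($3,$4) pair.
# Second file: append whatever was stored for the same pair (empty if none).
result=$(awk '
NR==FNR && /Company Code/   { cc[$3,$4]  = $NF; next }
NR==FNR && /Invoice Number/ { inv[$3,$4] = $NF; next }
NR==FNR { next }
{ print $0 FS cc[$3,$4] inv[$3,$4] }
' "$tmp/input1" "$tmp/input2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this data the record for group `2 1` gets `2272014010` appended, while the record for group `2 6` is printed with only a trailing separator, because nothing was stored for that pair.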
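Answer 1 (score: 0)
Your awk code has several problems.
Let's go through them step by step:
NR==FNR && NF>1 {...;next}NR==FNR && ...
-> the next in the first action prevents every later action from running for any record that the first action handles
NR==FNR && ( /Company Code/ OR /Invoice Number/ ) {
-> OR is not a valid awk operator; logical OR is written as || (just as you used && rather than AND).
print $0 a[substr($0,99)]
-> a[substr($0,99)] uses everything from position 99 onward of the record in the second input file as the array key, but your keys are positions 37-50.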
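The effect of the first problem can be seen in isolation with a small sketch. The two input lines are made up, and the counter name `hits` is only illustrative; the point is that a rule following an action that ends in `next` never sees the records that action handled.

```shell
# Two hypothetical records, both with more than one field.
sample=$(printf '%s\n' 'a Company Code 1' 'b Item Code 2')

# With next in the first action, the second rule is never reached.
with_next=$(printf '%s\n' "$sample" |
  awk 'NF>1 { next } /Company Code/ { hits++ } END { print hits+0 }')

# Without next, awk falls through and the second rule also runs.
without_next=$(printf '%s\n' "$sample" |
  awk 'NF>1 { } /Company Code/ { hits++ } END { print hits+0 }')

printf 'with next: %s, without next: %s\n' "$with_next" "$without_next"
```

The first command reports 0 matches and the second reports 1, which is why the array `a` in the original script never gets filled.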
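We can fix them as follows:
Get rid of the next in the first action, and restrict the third action to records from the second input file.
Replace OR with ||.
Use substr($0,37,14) as the key for the lookup in a, and apply substr(...,99) to the result.
This yields the following code (with the diagnostic print commands and the unused third input file removed):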
awk '
NR==FNR && NF>1 {
v=substr($0,37,14);
}
NR==FNR && ( /Company Code/ || /Invoice Number/ ) {
sub(/Company Code/,"",$0);
sub(/Invoice Number/,"",$0);
a[v]=$0;
next
}
NR!=FNR && (substr($0,37,14) in a) {
print $0 substr(a[substr($0,37,14)],99)
}' input1.txt input2.txt
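Since your input seems to be off, I could not reproduce your desired output, but I hope you can take it from here.
Additionally, I shortened your code to the following version, which, given the input, does what I think you want it to do: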
awk '
{key=substr($0,37,14)}
NR==FNR{
if(/Company Code/||/Invoice Number/)array[key]=substr($0,98)
next
}
(key in array){print $0,array[key]}
' input1.txt input2.txt
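A possible variant of the shortened version, sketched under several assumptions: the key really sits at positions 37-50, the label is somewhere before position 99, and the value starts exactly at position 99. The sample lines are generated with `printf` padding and are hypothetical, as are the names `val`, `in1` and `in2`. Unlike the version above, this sketch concatenates company code and invoice number instead of keeping only the last one, and it also prints unmatched records unchanged, which is what the desired output seems to show.

```shell
tmp=$(mktemp -d)
# Layout assumed here: cols 1-36 filler, 37-50 key, 51-98 label, 99+ value.
fmt='%-36s%-14s%-48s%s\n'
{
  printf "$fmt" '612' 'AAAAAAAAAAAAAA' 'Due Date'       '20140101'
  printf "$fmt" '612' 'AAAAAAAAAAAAAA' 'Company Code'   '227'
  printf "$fmt" '612' 'AAAAAAAAAAAAAA' 'Invoice Number' '2014010'
} > "$tmp/in1"
{
  printf '%-36s%-14s%s\n' '510' 'AAAAAAAAAAAAAA' ' 1ChK'
  printf '%-36s%-14s%s\n' '510' 'ZZZZZZZZZZZZZZ' ' 1ChK'
} > "$tmp/in2"

out=$(awk '
{ key = substr($0, 37, 14) }            # fixed-width key, both files
NR == FNR {                             # first file: collect the values
    if (/Company Code/ || /Invoice Number/)
        val[key] = val[key] substr($0, 99)
    next
}
{                                       # second file: append when matched
    if (key in val) print $0 " " val[key]
    else            print $0
}
' "$tmp/in1" "$tmp/in2")

printf '%s\n' "$out"
rm -rf "$tmp"
```

If the real value starts at a different column, only the `99` in `substr($0, 99)` needs to change.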
Feel free to comment if you need adjustments or explanations.