Looking up field values across two fixed-format files in Unix - not working

Time: 2014-02-26 00:53:13

Tags: shell unix awk

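I have two fixed-length files, Input #1 and Input #2. I want to match lines based on the value at positions 37-50 in both files (positions 37-50 hold the same value in both files).

If a matching record is found, cut the Company Code and Invoice Number values from input file #1 (position 99 to end of line).

The cut string (from Input #1) needs to be appended to the end of the record/line.

Below is the code I tried (not working), along with the input files and the desired output. Please offer your suggestions.

Code: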

awk '
NR==FNR && NF>1 {
    v=substr($0,37,14);
#print substr($0,37,14)
    next
}
NR==FNR && ( /Company Code/ OR /Invoice Number/ ) {
    sub(/Company Code/,"",$0);
    sub(/Invoice Number/,"",$0);
    a[v]=$0;
print $0
    next
}
(substr($0,37,14) in a) {
    print $0 a[substr($0,99)]
}' Input1.txt input2.txt input3.txt

End of code

Start of Input #1 (lines begin with some spaces)

         612  1111111111201402120000       2     1  111  211 Due Date                             20140101                           
         612  1111111111201402120000       2     1  111  311 Company Code                         227                                
         612  1111111111201402120000       2     1  111  411 Item Code                            12                                 
         612  1111111111201402120000       2     1  111  511 Invoice Number                       2014010                            
         612  1111111111201402120000       2     2  111  611 Company Code                         214                                
         612  1111111111201402120000       2     2  111  711 Item Code                            20                                 
         612  1111111111201402120000       2     2  111  811 Invoice Number                       3014010                            
         612  1111111111201402120000       2     3  111  911 Due Date                             20140101                           
         612  1111111111201402120000       2     3  111  111 Invoice Number                       40140101                           
         612  1111111111201402120000       2     3  111  121 user code                            15563263636                        
         612  1111111111201402120000       2     3  111  131 Amount Due                           100000                             
         612  111111111120140212000078978982123444  111  141 Due Date                             20140101                             
         612  111111111120140212000078978982123444  111  151 Invoice Number                       50140101                             
         612  111111111120140212000078978982123444  111  161 Amount Due                          008000                             

End of Input #1

Start of Input #2

         510       77432201111010000       2     1        1ChK          100111000001    121000248           123456789            20111101.510.77432.20001C                         
         510       77432201111010000       2     1        2INv                                                                   20111101.510.77432.20001D                         
         510       77432201111010000       2     1        3INv                                                                   20111101.510.77432.20002D                         
         510       77432201111010000       2     1        4INv                                                                   20111101.510.77432.20003D                         
         510       77432201111010000       2     1        5INv                                                                   20111101.510.77432.20004D                         
         510       77432201111010000       2     2        1ChK          200111000002    121000248           123456789            20111101.510.77432.20002C                         
         510       77432201111010000       2     2        2INv                                                                   20111101.510.77432.20005D                         
         510       77432201111010000       2     2        3INv                                                                   20111101.510.77432.20006D                         
         510       77432201111010000       2     2        4INv                                                                   20111101.510.77432.20007D                         
         510       77432201111010000       2     2        5INv                                                                   20111101.510.77432.20008D                         
         510       77432201111010000       2     3        1ChK          300111000003    121000248           123456789            20111101.510.77432.20003C                         
         510       77432201111010000       2     3        2INv                                                                   20111101.510.77432.20009D                         
         510       77432201111010000       2     3        3INv                                                                   20111101.510.77432.20010D                         
         510       77432201111010000       2     3        4INv                                                                   20111101.510.77432.20011D                         
         510       77432201111010000       2     6        1ChK          600111000006    121000248           123456789            20111101.510.77432.20006C                         
         510       77432201111010000       2     6        2INv                                                                   20111101.510.77432.20021D                         
         510       77432201111010000       2     6        3INv                                                                   20111101.510.77432.20022D                         
         510       77432201111010000       2     6        4INv                                                                   20111101.510.77432.20023D                         
         510       77432201111010000       2     6        5INv                                                                   20111101.510.77432.20024D                         

End of Input #2

Desired output

         510       77432201111010000       2     1        1ChK          100111000001    121000248           123456789            20111101.510.77432.20001C   2272014010 (company & Inv # from input 1)                     
         510       77432201111010000       2     1        2INv                                                                   20111101.510.77432.20001D   2272014010                                            
         510       77432201111010000       2     1        3INv                                                                   20111101.510.77432.20002D   2272014010                                            
         510       77432201111010000       2     1        4INv                                                                   20111101.510.77432.20003D   (company & Inv # from input 1)                      
         510       77432201111010000       2     1        5INv                                                                   20111101.510.77432.20004D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        1ChK          200111000002    121000248           123456789            20111101.510.77432.20002C   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        2INv                                                                   20111101.510.77432.20005D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        3INv                                                                   20111101.510.77432.20006D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        4INv                                                                   20111101.510.77432.20007D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        5INv                                                                   20111101.510.77432.20008D   (company & Inv # from input 1)                      
         510       77432201111010000       2     3        1ChK          300111000003    121000248           123456789            20111101.510.77432.20003C   (company & Inv # from input 1)                      
         510       77432201111010000       2     6        1ChK          600111000006    121000248           123456789            20111101.510.77432.20006C   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        2INv                                                                   20111101.510.77432.20021D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        3INv                                                                   20111101.510.77432.20022D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        4INv                                                                   20111101.510.77432.20023D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        5INv                                                                   20111101.510.77432.20024D   <there is no matching record in input 1, this will be blank>                      

2 answers:

Answer 0 (score: 0)

Try something like this (untested):
awk '
NR==FNR && /Company Code/ {
    cc[$3,$4] = $NF;
    next;
}
NR==FNR && /Invoice Number/ {
    inv[$3,$4] = $NF;
    next;
}
NR==FNR {next}
{print $0 FS cc[$3,$4] inv[$3,$4]}' input1 input2
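
For illustration, here is a minimal sketch of the same field-based idea run against abbreviated, made-up records (the temporary file names and the shortened lines are assumptions, not the real fixed-width data). It assumes the key values always land in whitespace-separated fields 3 and 4, which holds only when every column before them is non-blank.

```shell
#!/bin/sh
# Abbreviated, hypothetical sample records: only the whitespace-separated
# fields matter here ($3 and $4 form the key, $NF is the value).
tmp=$(mktemp -d)
cat > "$tmp/input1" <<'EOF'
612 1111111111201402120000 2 1 111 311 Company Code 227
612 1111111111201402120000 2 1 111 511 Invoice Number 2014010
EOF
cat > "$tmp/input2" <<'EOF'
510 77432201111010000 2 1 1ChK
510 77432201111010000 2 6 1ChK
EOF

result=$(awk '
NR==FNR && /Company Code/   { cc[$3,$4]  = $NF; next }  # remember company code per key
NR==FNR && /Invoice Number/ { inv[$3,$4] = $NF; next }  # remember invoice number per key
NR==FNR { next }                                         # ignore other first-file records
{ print $0 FS cc[$3,$4] inv[$3,$4] }                     # second file: append both values
' "$tmp/input1" "$tmp/input2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample, the first line of the second file gets 2272014010 appended, and the second line (key 2 6, which has no match) gets only the separator. In the real Input #1, lines where the columns run together would shift the field numbers, so a substr-based key may be the safer choice there.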

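Answer 1 (score: 0)

Your awk code has several problems.

Let's go through them step by step:

  1. NR==FNR && NF>1 {...;next}NR==FNR && ... -> the next in the first action skips all remaining actions for every first-file record with more than one field, so the second action never runs for those records.

  2. NR==FNR && ( /Company Code/ OR /Invoice Number/ ) { -> OR is not a valid awk operator; logical OR is written || (just as you use && rather than AND).

  3. print $0 a[substr($0,99)] -> a[substr($0,99)] uses everything from position 99 onward of the second-file record as the array lookup key, but your keys are positions 37-50.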

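We can fix them as follows:

  1. Get rid of the next in the first action, and restrict the third action to records from the second input file.

  2. Replace OR with ||.

  3. Use substr($0,37,14) as the key to look up a, then take substr(...,99) of the result.

This yields the following code (with the diagnostic print commands and the unused third input file removed):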

      awk '
      NR==FNR && NF>1 {
          v=substr($0,37,14);
      }
      NR==FNR && ( /Company Code/ || /Invoice Number/ ) {
          sub(/Company Code/,"",$0);
          sub(/Invoice Number/,"",$0);
          a[v]=$0;
          next
      }
      NR!=FNR && (substr($0,37,14) in a) {
          print $0 substr(a[substr($0,37,14)],99)
      }' input1.txt input2.txt
      

      Since your input is off, I couldn't reproduce your desired output, but I hope you can take it from here.
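
To illustrate the first point about next, here is a minimal sketch on a made-up three-line input (the line contents and the counter names seen and hits are hypothetical). It counts how often each rule fires with and without next in the first rule.

```shell
#!/bin/sh
# Hypothetical three-line input; the line contents are made up for illustration.
input='a Company Code 1
b other 2
c Invoice Number 3'

# With next in the first rule, the second rule is never reached for NF>1 records.
with_next=$(printf '%s\n' "$input" | awk '
NF>1 { seen++; next }
/Company Code/ || /Invoice Number/ { hits++ }
END { print seen+0, hits+0 }')

# Without next, both rules run for the same record.
without_next=$(printf '%s\n' "$input" | awk '
NF>1 { seen++ }
/Company Code/ || /Invoice Number/ { hits++ }
END { print seen+0, hits+0 }')

echo "with next:    $with_next"
echo "without next: $without_next"
```

The first run reports 3 0 (the second rule never fires), the second reports 3 2.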

      Also, I shortened your code to the following version, which does what I think you want it to do, based on the given input:

      awk '
      {key=substr($0,37,14)}
      NR==FNR{
        if(/Company Code/||/Invoice Number/)array[key]=substr($0,98)
        next
      }
      (key in array){print $0,array[key]}
      ' input1.txt input2.txt
      

      If you need adjustments or explanations, feel free to comment.
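
One possible adjustment, sketched under assumptions: the desired output shows the Company Code and Invoice Number joined together (2272014010), and unmatched lines are still printed. The variant below appends both trimmed values per key instead of overwriting, and passes unmatched lines through unchanged. The sample records, the key labels KEY-A and KEY-B, and the array name extra are made up for the demonstration; only the column positions (key in 37-50, value from 99) come from the question.

```shell
#!/bin/sh
# Build hypothetical fixed-width records: key in columns 37-50, value from column 99.
tmp=$(mktemp -d)
{
  printf '%-36s%-14s%-48s%s\n' 'rec1' 'KEY-A' '  Company Code'   '227   '
  printf '%-36s%-14s%-48s%s\n' 'rec2' 'KEY-A' '  Item Code'      '12    '
  printf '%-36s%-14s%-48s%s\n' 'rec3' 'KEY-A' '  Invoice Number' '2014010 '
} > "$tmp/input1.txt"
{
  printf '%-36s%-14s%s\n' 'row1' 'KEY-A' 'payload'
  printf '%-36s%-14s%s\n' 'row2' 'KEY-B' 'payload'
} > "$tmp/input2.txt"

result=$(awk '
{ key = substr($0, 37, 14) }                 # fixed-width key, columns 37-50
NR==FNR {
    if (/Company Code/ || /Invoice Number/) {
        val = substr($0, 99)                 # value runs from column 99 to end of line
        gsub(/^ +| +$/, "", val)             # trim padding
        extra[key] = extra[key] val          # append instead of overwriting
    }
    next
}
{ print $0 ((key in extra) ? " " extra[key] : "") }   # unmatched lines pass through unchanged
' "$tmp/input1.txt" "$tmp/input2.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Here row1 comes out with 2272014010 appended and row2 is printed as is. If a key can carry more than one Company Code or Invoice Number in the real data, the values would all be concatenated in file order, so that case would need a different rule.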