awk: fill down and generate ranges separated by & and &&

Time: 2014-07-25 15:01:12

Tags: awk

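I want to read the first field, fill it down onto the rows below it, and expand it into ranges according to "&-" and "&&-".

Example: if the DIGITS field is 210&-3, only 210 and 213 should be populated.
         If the DIGITS field is 210&&-3, then 210, 211, 212 and 213 should be populated.

Sample Input.txt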

DIGITS                   AL DEST         CHI CNT NEDEST       CORG  NCHA   
20                        0 ABC          1   N   ABC000       0     CHARGE      
                          1 ABC          1   N   ABC111       0     CHARGE      
                          2 ABC          1   N   ABC222       0     CHARGE      
                          3 ABC          1   N   ABC333       0     CHARGE      
                          4 ABC          1   N   ABC444       0     CHARGE      
210&-2                    0 ABC          1   N   ABC000       0     CHARGE      
                          1 ABC          1   N   ABC111       0     CHARGE      
                          2 ABC          1   N   ABC222       0     CHARGE      
                          3 ABC          1   N   ABC333       0     CHARGE      
                          4 ABC          1   N   ABC444       0     CHARGE      
2130&&-3&-6&&-8           0 ABC          1   N   ABC000       0     CHARGE      
                          1 ABC          1   N   ABC111       0     CHARGE      
                          2 ABC          1   N   ABC222       0     CHARGE      
                          3 ABC          1   N   ABC333       0     CHARGE      
                          4 ABC          1   N   ABC444       0     CHARGE 

Expected output:

DIGITS                   AL DEST         CHI CNT NEDEST       CORG  NCHA   
20                        0 ABC          1   N   ABC000       0     CHARGE      
20                        1 ABC          1   N   ABC111       0     CHARGE      
20                        2 ABC          1   N   ABC222       0     CHARGE      
20                        3 ABC          1   N   ABC333       0     CHARGE      
20                        4 ABC          1   N   ABC444       0     CHARGE      
210                       0 ABC          1   N   ABC000       0     CHARGE      
210                       1 ABC          1   N   ABC111       0     CHARGE      
210                       2 ABC          1   N   ABC222       0     CHARGE      
210                       3 ABC          1   N   ABC333       0     CHARGE      
210                       4 ABC          1   N   ABC444       0     CHARGE      
212                       0 ABC          1   N   ABC000       0     CHARGE      
212                       1 ABC          1   N   ABC111       0     CHARGE      
212                       2 ABC          1   N   ABC222       0     CHARGE      
212                       3 ABC          1   N   ABC333       0     CHARGE      
212                       4 ABC          1   N   ABC444       0     CHARGE      
2130                      0 ABC          1   N   ABC000       0     CHARGE      
2130                      1 ABC          1   N   ABC111       0     CHARGE      
2130                      2 ABC          1   N   ABC222       0     CHARGE      
2130                      3 ABC          1   N   ABC333       0     CHARGE      
2130                      4 ABC          1   N   ABC444       0     CHARGE 
2131                      0 ABC          1   N   ABC000       0     CHARGE      
2131                      1 ABC          1   N   ABC111       0     CHARGE      
2131                      2 ABC          1   N   ABC222       0     CHARGE      
2131                      3 ABC          1   N   ABC333       0     CHARGE      
2131                      4 ABC          1   N   ABC444       0     CHARGE 
2132                      0 ABC          1   N   ABC000       0     CHARGE      
2132                      1 ABC          1   N   ABC111       0     CHARGE      
2132                      2 ABC          1   N   ABC222       0     CHARGE      
2132                      3 ABC          1   N   ABC333       0     CHARGE      
2132                      4 ABC          1   N   ABC444       0     CHARGE 
2133                      0 ABC          1   N   ABC000       0     CHARGE      
2133                      1 ABC          1   N   ABC111       0     CHARGE      
2133                      2 ABC          1   N   ABC222       0     CHARGE      
2133                      3 ABC          1   N   ABC333       0     CHARGE      
2133                      4 ABC          1   N   ABC444       0     CHARGE 
2136                      0 ABC          1   N   ABC000       0     CHARGE      
2136                      1 ABC          1   N   ABC111       0     CHARGE      
2136                      2 ABC          1   N   ABC222       0     CHARGE      
2136                      3 ABC          1   N   ABC333       0     CHARGE      
2136                      4 ABC          1   N   ABC444       0     CHARGE 
2137                      0 ABC          1   N   ABC000       0     CHARGE      
2137                      1 ABC          1   N   ABC111       0     CHARGE      
2137                      2 ABC          1   N   ABC222       0     CHARGE      
2137                      3 ABC          1   N   ABC333       0     CHARGE      
2137                      4 ABC          1   N   ABC444       0     CHARGE 
2138                      0 ABC          1   N   ABC000       0     CHARGE      
2138                      1 ABC          1   N   ABC111       0     CHARGE      
2138                      2 ABC          1   N   ABC222       0     CHARGE      
2138                      3 ABC          1   N   ABC333       0     CHARGE      
2138                      4 ABC          1   N   ABC444       0     CHARGE 

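I know how to generate a consecutive sequence from a simple start and end value, but not at this level of complexity. I searched a lot for a similar solution without any luck. Any suggestions?

1 answer:

Answer 0: (score: 2)

One idea is to split the data into multi-line records, using the digit code in the first column as the record separator (this needs a regular-expression RS, as in gawk). An example of one record would then be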

Record separator (stored in RT):
2130&&-3&-6&&-8

Record:
           0 ABC          1   N   ABC000       0     CHARGE      
                          1 ABC          1   N   ABC111       0     CHARGE      
                          2 ABC          1   N   ABC222       0     CHARGE      
                          3 ABC          1   N   ABC333       0     CHARGE      
                          4 ABC          1   N   ABC444       0     CHARGE 
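
The splitting step can be sketched as follows. This is only an illustration under some assumptions: gawk is installed, the file starts with a header line (so that every digit code is preceded by a newline), and the function name split_records is made up for this example. One detail worth noting is that gawk sets RT to the separator that ended the current record, so the code belonging to a record has to be saved from the previous one.

```sh
# split_records FILE: show which digit code goes with which record.
# The function name is hypothetical; the regex assumes a header line.
split_records() {
    gawk '
    # A separator is a newline followed by a digit code such as 2130&&-3&-6.
    BEGIN { RS = "\n[0-9][0-9&-]*" }
    {
        # With a regex RS the last record keeps the final newline of the file.
        sub(/\n$/, "")
        # RT is the separator AFTER this record, so the code for this
        # record is the one remembered from the previous record.
        if (code != "")
            printf "code=%s lines=%d\n", code, split($0, lines, "\n")
        code = substr(RT, 2)    # strip the leading newline
    }' "$1"
}
```

Run against the sample Input.txt, this should report three codes (20, 210&-2 and 2130&&-3&-6&&-8), each with five lines; the header line is the first record and has no code.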

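Write a print function that prints the lines of a record with a number prefix, and call it once for each number the lines should be prefixed with.

To compute the prefix numbers, write a function that operates on the digit code taken from the record separator text. It computes the next prefix number (number) and the new value of the digit code (numcode). Both numcode and number can be global. For loop control, the function should return 0 if there are no more prefix numbers, and 1 otherwise.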

The rules for computing numcode and number are:

if numcode is the empty string, return 0.

set number to initial digits in numcode (before any ampersands)

if numcode is just a number:
   set numcode to the empty string
if number in numcode has one ampersand after it:
   change the last digit of the number in the numcode to the number after the first dash
   remove the first &-n substring
if number in numcode has two ampersands after it:
   change the last digit of the number in the numcode by adding one to it
   if it's equal to the number after the first dash
     remove the first &&-n substring

return 1
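
The rules above can be turned into awk roughly as follows. This is a sketch, not the answerer's code: the names next_prefix and expand_numcode are invented here, ranges are assumed to be ascending, and a multi-digit number after the dash is assumed to replace that many trailing digits (the question only shows single digits). It uses only POSIX awk features.

```sh
# expand_numcode CODE: print every prefix number encoded by CODE on one line.
# The wrapper and function names are illustrative only.
expand_numcode() {
    awk -v code="$1" '
    # Sets the global number to the next prefix and rewrites the global
    # numcode. Returns 0 once numcode is exhausted, 1 otherwise.
    function next_prefix(   rest, seg, amps, target, len, base, tail) {
        if (!match(numcode, /^[0-9]+/)) return 0
        number = substr(numcode, 1, RLENGTH)
        rest = substr(numcode, RLENGTH + 1)
        # No &-n or &&-n segment left: this is the last prefix.
        if (!match(rest, /^&&?-[0-9]+/)) { numcode = ""; return 1 }
        seg = RLENGTH
        amps = (substr(rest, 2, 1) == "&") ? 2 : 1
        target = substr(rest, amps + 2, seg - amps - 1)
        len = length(target)
        base = substr(number, 1, length(number) - len)
        tail = substr(number, length(number) - len + 1)
        # Inside a && range: step by one and keep the &&-n segment.
        # Otherwise (single & jump, or last step of a range): land on n
        # and drop the segment.
        if (amps == 2 && tail + 1 < target + 0)
            numcode = base sprintf("%0" len "d", tail + 1) rest
        else
            numcode = base target substr(rest, seg + 1)
        return 1
    }
    BEGIN {
        numcode = code
        while (next_prefix()) out = out (out == "" ? "" : " ") number
        print out
    }'
}
```

For instance, expand_numcode '2130&&-3&-6&&-8' prints 2130 2131 2132 2133 2136 2137 2138, which are the prefixes shown in the expected output.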

Examples:

numcode in   number is    numcode out     Returns
""                                        0
120          120          ""              1
120&-2       120          122             1
120&-2&-4    120          122&-4          1
120&&-3      120          121&&-3         1
121&&-3      121          122&&-3         1
122&&-3      122          123             1

Extended example:

numcode in       number   numcode out       Returns
120&&-2&-5&&-7   120      121&&-2&-5&&-7    1
121&&-2&-5&&-7   121      122&-5&&-7        1
122&-5&&-7       122      125&&-7           1
125&&-7          125      126&&-7           1
126&&-7          126      127               1
127              127      ""                1
""               --       --                0
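
Putting the pieces together, a complete script might look like the sketch below. It is an illustration under stated assumptions rather than the answerer's code: it needs gawk (for the regex RS and for RT), it assumes the file starts with a header line, it assumes ascending ranges, and the names fill_down and next_prefix are made up here.

```sh
# fill_down FILE: fill the digit code down and expand & / && ranges.
# Function names are hypothetical; requires gawk.
fill_down() {
    gawk '
    # Sets the global number to the next prefix and rewrites the global
    # numcode. Returns 0 once numcode is exhausted, 1 otherwise.
    function next_prefix(   rest, seg, amps, target, len, base, tail) {
        if (!match(numcode, /^[0-9]+/)) return 0
        number = substr(numcode, 1, RLENGTH)
        rest = substr(numcode, RLENGTH + 1)
        if (!match(rest, /^&&?-[0-9]+/)) { numcode = ""; return 1 }
        seg = RLENGTH
        amps = (substr(rest, 2, 1) == "&") ? 2 : 1
        target = substr(rest, amps + 2, seg - amps - 1)
        len = length(target)
        base = substr(number, 1, length(number) - len)
        tail = substr(number, length(number) - len + 1)
        if (amps == 2 && tail + 1 < target + 0)
            numcode = base sprintf("%0" len "d", tail + 1) rest
        else
            numcode = base target substr(rest, seg + 1)
        return 1
    }
    # A separator is a newline followed by a digit code.
    BEGIN { RS = "\n[0-9][0-9&-]*" }
    {
        sub(/\n$/, "")      # the last record keeps the final newline
        if (code == "") {
            print           # header: everything before the first code
        } else {
            # Put blanks where the code was so all rows have the same layout.
            n = split(sprintf("%" length(code) "s", "") $0, rows, "\n")
            numcode = code
            # Print the whole record once per prefix number.
            while (next_prefix())
                for (i = 1; i <= n; i++)
                    print number substr(rows[i], length(number) + 1)
        }
        code = substr(RT, 2)    # RT ended this record; its code is for the next one
    }' "$1"
}
```

Calling fill_down Input.txt on the sample input should reproduce the expected output, with each prefix number overwriting the leading blanks of every row so that the columns stay aligned.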