Question

I hope you can help me solve my problem.

I have an input file with 3 columns of data, as shown below:

Apl_No Act_No Sfx_No 
100    10     0
100    11     1
100    12     2
100    13     3
101    20     0
101    21     1

I need to create an output file that contains the data from the input plus 3 additional fields. It should look like this:

Apl_No Act_No Sfx_No Crt_Act_No Prs_Act_No Cd_Act_No
100    10     0       -         -          -
100    11     1       10        11         12
100    12     2       11        12         13
100    13     3       12        13         10
101    20     0       -         -          -
101    21     1       20        21         20

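Each Apl_No has a set of Act_No values mapped to it. Three new fields need to be created: Crt_Act_No, Prs_Act_No and Cd_Act_No. When the first row of a unique Apl_No is encountered, columns 4, 5 and 6 (Crt_Act_No, Prs_Act_No, Cd_Act_No) need to be dashed out. For each subsequent occurrence of the same Apl_No, Crt_Act_No is the Act_No of the previous row, Prs_Act_No is the Act_No of the current row, and Cd_Act_No is the Act_No of the next row. This continues for all subsequent rows with the same Apl_No except the last one. In the last row, Crt_Act_No and Prs_Act_No are filled in the same way as in the rows above, but Cd_Act_No needs to be taken from the Act_No of the first row, the one where that unique Apl_No was first encountered.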
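Read against the sample output, the rule appears to amount to a circular neighbor lookup inside each group: for row i of n (counting from 1, with i at least 2), the three values are the Act_No of rows i-1, i and (i mod n)+1. The following is only a minimal sketch of that reading; the helper name `neighbors` and its space-separated argument are made up for illustration and are not part of the question.

```shell
# Hypothetical helper: takes the Act_No values of ONE group as a
# space-separated list and prints "Crt Prs Cd" for every row after the first.
neighbors() {
    awk -v list="$1" 'BEGIN {
        n = split(list, act, " ")
        for (i = 2; i <= n; i++)
            # previous row, current row, next row (wrapping to row 1 at the end)
            print act[i - 1], act[i], act[i % n + 1]
    }'
}

neighbors "10 11 12 13"
# 10 11 12
# 11 12 13
# 12 13 10

neighbors "20 21"
# 20 21 20
```

The first row of each group is skipped here because it only receives dashes.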

I would like to do this with awk. Can anyone help me with this?

Answer 1

One solution:

awk '
    ## Print header in first line.
    FNR == 1 {
        printf "%s %s %s %s\n", $0, "Crt_Act_No", "Prs_Act_No", "Cd_Act_No";
        next;
    }

    ## If the first field is not yet in the hash, this is the first occurrence of
    ## this "Apl_No", so print the line with dashes and save some data for later.
    ## "line" holds the pending row from the previous iteration. Print it if set.
    ! apl[ $1 ] {
        if ( line ) {
            sub( /-/, orig_act, line );
            print line;
            line = "";
        }
        printf "%s %s %s %s\n", $0, "-", "-", "-";
        orig_act = prev_act = $2;
        apl[ $1 ] = 1;
        next;
    }

    ## For every later row of an "Apl_No" that was already seen...
    {
        ## If it is the first one after the line with dashes ("line" not set),
        ## save its content in "line" and remember the value needed later
        ## ("Act_No"). Note that a dash is left in the last field, to be
        ## substituted in the following iteration.
        if ( ! line ) {
            line = sprintf( "%s %s %s %s", $0, prev_act, $2, "-" );
            prev_act = $2;
            next;
        }

        ## Now the next "Act_No" is known, so substitute the dash with it, print,
        ## and repeat the process with the current line.
        sub( /-/, $2, line );
        print line;
        line = sprintf( "%s %s %s %s", $0, prev_act, $2, "-" );
        prev_act = $2;
    }
    END {
        if ( line ) {
            sub( /-/, orig_act, line );
            print line;
        }        
    }
' infile | column -t

It yields:

Apl_No  Act_No  Sfx_No  Crt_Act_No  Prs_Act_No  Cd_Act_No
100     10      0       -           -           -
100     11      1       10          11          12
100     12      2       11          12          13
100     13      3       12          13          10
101     20      0       -           -           -
101     21      1       20          21          20
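The answer above prints each row one step late and patches a placeholder dash with `sub()`. A possible alternative, shown here only as a sketch and not as the answerer's method, is to buffer one group at a time and emit it when the Apl_No changes. It assumes that rows with the same Apl_No are contiguous and that the first line is a header; the names `add_neighbors`, `emit_group`, `row` and `act` are hypothetical.

```shell
# Hypothetical wrapper: reads the table from the file named in $1.
add_neighbors() {
    awk '
        # Print the buffered group: the first row gets dashes, later rows get
        # the previous, current and next Act_No (wrapping to the first row).
        function emit_group(    i) {
            for (i = 1; i <= n; i++) {
                if (i == 1)
                    print row[i], "-", "-", "-"
                else
                    print row[i], act[i - 1], act[i], act[i % n + 1]
            }
            n = 0
        }

        # Header: normalize spacing, then append the three new column names.
        FNR == 1 { $1 = $1; print $0, "Crt_Act_No", "Prs_Act_No", "Cd_Act_No"; next }

        # Ignore blank lines.
        NF == 0 { next }

        # A new Apl_No starts: flush the previous group (assumes contiguity).
        $1 != apl { emit_group(); apl = $1 }

        # Buffer the current row with single-space separators.
        { $1 = $1; n++; row[n] = $0; act[n] = $2 }

        END { emit_group() }
    ' "$1"
}
```

Run as `add_neighbors infile | column -t`, this should produce the same table as above. Because nothing is substituted into an already formatted line, a literal dash inside the data would not be mistaken for the placeholder.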
