awk: lines to multiple columns

Date: 2018-05-23 15:10:24

Tags: awk multiple-columns transformation lines

The input text used to produce the aligned lines has this format:

(LINE_A) is the name of a file; for example, LINE_A is located in directory xy. The file contains:

file:G_VALUEFX:D_VALUEFX;SEAT01

This returns:

7 LINE_A G_VALUEFX D_VALUEFX SEAT01 SEAT02 SEAT03 SEAT04 

(The number in column 1 is the total count of the columns that follow it on the line.)

I need help converting lines like these from rows,

e.g.

7 LINE_A G_VALUEFX D_VALUEFX SEAT01 SEAT02 SEAT03 SEAT04      
7 LINE_B G_VALUEFX D_VALUEFX SEAT22 SEAT25 SEAT27 SEAT30      
7 LINE_A G_VALUEFA D_VALUEFA SEAT01 SEAT02 SEAT03 SEAT04      
7 LINE_B G_VALUEFA D_VALUEFA SEAT22 SEAT25 SEAT27 SEAT30      

into columns:

7 LINE_A    7 LINE_B    7 LINE_A     7 LINE_B 
G_VALUEFX   G_VALUEFX   G_VALUEFA    G_VALUEFA 
D_VALUEFX   D_VALUEFX   D_VALUEFA    D_VALUEFA 
SEAT01      SEAT22      SEAT01       SEAT22 
SEAT02      SEAT25      SEAT02       SEAT25 
SEAT03      SEAT27      SEAT03       SEAT27 
SEAT04      SEAT30      SEAT04       SEAT30 

(I'm not sure whether it is possible to convert it so that the columns are aligned like this:)

7 LINE_A   |  7 LINE_B   | 7 LINE_A   |  7 LINE_B 
G_VALUEFX  |  G_VALUEFX  | G_VALUEFA  |  G_VALUEFA 
D_VALUEFX  |  D_VALUEFX  | D_VALUEFA  |  D_VALUEFA 
SEAT01     |  SEAT22     | SEAT01     |  SEAT22 
SEAT02     |  SEAT25     | SEAT02     |  SEAT25 
SEAT03     |  SEAT27     | SEAT03     |  SEAT27 
SEAT04     |  SEAT30     | SEAT04     |  SEAT30 

It can happen that some lines are longer or shorter than others, for example:

7 LINE_A G_VALUEFX D_VALUEFX SEAT01 SEAT02 SEAT03 SEAT04      
7 LINE_B G_VALUEFX D_VALUEFX SEAT22 SEAT25 SEAT27 SEAT30      
7 LINE_A G_VALUEFA D_VALUEFA SEAT01 SEAT02 SEAT03 SEAT04      
7 LINE_B G_VALUEFA D_VALUEFA SEAT22 SEAT25 SEAT27 EXNUM899999SSSSS9S8S5S2S8    
7 LINE_C G_PREFX D_VALUEFX SEAT01 SEAT02 SEAT03 SEAT04      
8 LINE_G G_PREFX D_VALUEFX POSITION55 POSITION82 VALUE85 POSITION44 POSITION448
7 LINE_C G_PREFA D_VALUEFA SEAT01 SEAT02 SEAT03       
4 LINE_H G_PREFA D_VALUEFA SEAT22
5 LINE_H G_NAMEA D_EXPIRY5 SEAT01 SEAT02 
3 LINE_H G_NAMEA D_EXPIRY5 
7 LINE_B G_NAMEY D_EXPIRY1 SEAT22 SEAT25 SEAT27 EXNUM899999SSSSS9S8S5S2S8     

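The output could then look like this (more input lines means more columns to align and place), if possible with "|" as the column separator. The number should always come first, then LINE_A/B, then the G-prefixed value, then the D-prefixed value; the rest are values with arbitrary content. (If it is more convenient, the number before LINE_A/B can be left out.)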

7 LINE_A    7 LINE_B     7 LINE_A      7 LINE_B                    7 LINE_C  8 LINE_G     7 LINE_C   4 LINE_H   5 LINE_H   3 LINE_H  7 LINE_B 
G_VALUEFX   G_VALUEFX    G_VALUEFA     G_VALUEFA                   G_PREFX   G_PREFX      G_PREFA    G_PREFA    G_NAMEA    G_NAMEA   G_NAMEY 
D_VALUEFX   D_VALUEFX    D_VALUEFA     D_VALUEFA                   D_VALUEFX D_VALUEFX    D_VALUEFA  D_VALUEFA  D_EXPIRY5  D_EXPIRY5 D_EXPIRY1
SEAT01      SEAT22       SEAT01        SEAT22                      SEAT01    POSITION55   SEAT01     SEAT22     SEAT01               SEAT22 
SEAT02      SEAT25       SEAT02        SEAT25                      SEAT02    POSITION82   SEAT02                SEAT02               SEAT25 
SEAT03      SEAT27       SEAT03        SEAT27                      SEAT03    VALUE85      SEAT03                                     SEAT27 
SEAT04      SEAT30       SEAT04        EXNUM899999SSSSS9S8S5S2S8   SEAT04    POSITION44                                              EXNUM899999SSSSS9S8S5S2S8 
                                                                             POSITION448  

Thanks

1 Answer:

Answer 0 (score: 0)

awk to the rescue!

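$ awk -v OFS=' | ' '{a[NR,1]=$1 FS $2;                    # header cell: count plus line name
                     for(j=3;j<=NF;j++) a[NR,j-1]=$j;     # remaining fields go to rows 2..NF-1
                     maxNF=maxNF<NF?NF:maxNF}             # remember the widest input line
                END {for(i=1;i<maxNF;i++)                 # one output row per field position
                       for(j=1;j<=NR;j++)                 # one output column per input line
                          printf "%s",a[j,i] (j==NR?ORS:OFS)}' file | column -ts'|'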

7 LINE_A     7 LINE_B     7 LINE_A     7 LINE_B                     7 LINE_C     8 LINE_G       7 LINE_C     4 LINE_H     5 LINE_H     3 LINE_H     7 LINE_B
G_VALUEFX    G_VALUEFX    G_VALUEFA    G_VALUEFA                    G_PREFX      G_PREFX        G_PREFA      G_PREFA      G_NAMEA      G_NAMEA      G_NAMEY
D_VALUEFX    D_VALUEFX    D_VALUEFA    D_VALUEFA                    D_VALUEFX    D_VALUEFX      D_VALUEFA    D_VALUEFA    D_EXPIRY5    D_EXPIRY5    D_EXPIRY1
SEAT01       SEAT22       SEAT01       SEAT22                       SEAT01       POSITION55     SEAT01       SEAT22       SEAT01                    SEAT22
SEAT02       SEAT25       SEAT02       SEAT25                       SEAT02       POSITION82     SEAT02                    SEAT02                    SEAT25
SEAT03       SEAT27       SEAT03       SEAT27                       SEAT03       VALUE85        SEAT03                                              SEAT27
SEAT04       SEAT30       SEAT04       EXNUM899999SSSSS9S8S5S2S8    SEAT04       POSITION44                                                         EXNUM899999SSSSS9S8S5S2S8
                                                                                 POSITION448

If you remove the pipe to the column formatter at the end, the output will be separated by pipe characters (but not aligned).
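
If the "|" separators should stay visible in the aligned output (the second layout shown in the question), or column is not available, the padding can probably be done inside awk as well. The following is only a minimal sketch and not part of the original answer: the function name transpose_aligned and the names put, cell, width, rows and n are assumptions made up for illustration. It stores every cell, tracks the widest cell of each output column, and pads with a format string built from that width.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): transpose input lines
# into columns, pad each column to its widest cell, join cells with " | ".
transpose_aligned() {
  awk '
    # Store one cell and keep track of the widest cell in its output column.
    function put(col, row, s) {
      cell[col,row] = s
      if (length(s) > width[col]) width[col] = length(s)
    }
    NF {                                        # ignore blank input lines
      n++                                       # n = number of output columns
      put(n, 1, $1 " " $2)                      # header cell: count and line name
      for (j = 3; j <= NF; j++) put(n, j-1, $j) # remaining fields, one per row
      if (NF - 1 > rows) rows = NF - 1          # rows = longest input line
    }
    END {
      for (i = 1; i <= rows; i++) {
        line = ""
        for (j = 1; j <= n; j++)                # missing cells print as padding
          line = line sprintf("%-" width[j] "s", cell[j,i]) (j < n ? " | " : "")
        sub(/ +$/, "", line)                    # drop trailing spaces
        print line
      }
    }' "$@"
}
```

Under these assumptions it would be called as transpose_aligned file, or fed from a pipe. Rows whose last cell is empty end in a bare "|" because the trailing padding is stripped.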