How can I extract this dataset using shell awk?

Time: 2018-01-15 12:48:05

Tags: shell awk

Thank you for your attention.

Raw data:

land_cover_classes  rows    columns LandCoverDist
"1 of 18"   "1 of 720"  "1 of 1440" 20
"1 of 18"   "1 of 720"  "2 of 1440" 0
"1 of 18"   "1 of 720"  "3 of 1440" 0
"10 of 18"  "1 of 720"  "4 of 1440" 1
"9 of 18"   "110 of 720"    "500 of 1440"   0
"1 of 18"   "1 of 720"  "6 of 1440" 354
"1 of 18"   "1 of 720"  "7 of 1440" 0
"1 of 18"   "1 of 720"  "8 of 1440" 0
"1 of 18"   "720 of 720"    "1440 of 1440"  0

The expected output is:

land_cover_classes  rows    columns LandCoverDist
1   1   1   20
......
9   110 500 0
1   1   6   354
......
1   720 1440    0

3 answers:

Answer 0: (score: 2)

Awk solution:

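awk 'BEGIN{ FS="\"[[:space:]]+"; OFS="\t" }   # split on a closing quote plus whitespace; join output with tabs
     function get_num(n){
         gsub(/^"| of.*/,"",n);               # drop the leading quote and everything from " of" onward
         return n
     }
     # print the header unchanged; reduce each quoted field of the data rows to its leading number
     NR==1; NR>1{ print get_num($1), get_num($2), get_num($3), $4 }' file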

Output:

land_cover_classes  rows    columns LandCoverDist
1   1   1   20
1   1   2   0
1   1   3   0
10  1   4   1
9   110 500 0
1   1   6   354
1   1   7   0
1   1   8   0
1   720 1440    0
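
To see why get_num has to strip a leading quote, it can help to print the fields exactly as awk splits them. This is a minimal sketch, assuming one sample row copied from the question and using `[ \t]` in place of `[[:space:]]` (a substitution made here for awk builds that lack POSIX character classes); the variable names `sample` and `fields` are illustrative only.

```shell
# One data row taken from the question (assumed to be space-separated).
sample='"9 of 18"   "110 of 720"    "500 of 1440"   0'

# Split on "a double quote followed by whitespace" and show every field.
fields=$(printf '%s\n' "$sample" |
  awk 'BEGIN { FS = "\"[ \t]+" }
       { for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

Each of the first three fields still begins with `"`, because the separator consumes only the closing quote and the whitespace after it. That leftover quote is what the `^"` alternative in the gsub call removes.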

Answer 1: (score: 2)

$ awk '
BEGIN { FS="\" *\"?" }                       # separator: a quote, optional spaces, optional quote
NR==1                                        # print header
{
    for(i=2;i<=NF;i++) {                     # start at field 2; $1 is empty on data rows
        split($i,a," of ")                   # split at " of "
        printf "%s%s", a[1], (i==NF?ORS:OFS) # print the first part and separator
    }
}' file
land_cover_classes  rows    columns LandCoverDist
1 1 1 20
1 1 2 0
1 1 3 0
10 1 4 1
9 110 500 0
1 1 6 354
1 1 7 0
1 1 8 0
1 720 1440 0
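
Note that ` *` in this FS matches spaces only. The question does not say whether the columns are separated by spaces or by tabs; if they were tabs, a variant along these lines might be needed. This is a sketch under that assumption, with a hypothetical tab-delimited row, `[ \t]*` instead of ` *`, and OFS set to a tab so the output stays tab-delimited.

```shell
# Hypothetical tab-delimited row (printf expands \t to a tab character).
out=$(printf '"10 of 18"\t"1 of 720"\t"4 of 1440"\t1\n' |
  awk 'BEGIN { FS = "\"[ \t]*\"?"; OFS = "\t" }
       {
           # $1 is the empty string before the first quote, so start at 2
           for (i = 2; i <= NF; i++) {
               split($i, a, " of ")                   # a[1] is the leading number
               printf "%s%s", a[1], (i == NF ? ORS : OFS)
           }
       }')

printf '%s\n' "$out"
```

The last column has no " of " in it, so split leaves the whole value in a[1] and it is printed unchanged.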

Answer 2: (score: 2)

$ awk -F'["[:space:]]+' 'NR>1{$0 = $2 OFS $5 OFS $8 OFS $11} 1' file
land_cover_classes  rows    columns LandCoverDist
1 1 1 20
1 1 2 0
1 1 3 0
10 1 4 1
9 110 500 0
1 1 6 354
1 1 7 0
1 1 8 0
1 720 1440 0

$ awk -F'["[:space:]]+' 'NR>1{$0 = $2 OFS $5 OFS $8 OFS $11} 1' file | column -t
land_cover_classes  rows  columns  LandCoverDist
1                   1     1        20
1                   1     2        0
1                   1     3        0
10                  1     4        1
9                   110   500      0
1                   1     6        354
1                   1     7        0
1                   1     8        0
1                   720   1440     0
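
The field numbers 2, 5, 8 and 11 follow from how this separator splits a row: every double quote and every run of whitespace is a delimiter, so each word of a value such as "9 of 18" becomes its own field. A small sketch to list them, assuming one row from the question and `[" \t]` as a stand-in for `["[:space:]]`; the names `sample` and `numbered` are illustrative only.

```shell
# One data row taken from the question (assumed to be space-separated).
sample='"9 of 18"   "110 of 720"    "500 of 1440"   0'

# Print every field with its index so the positions of the numbers are visible.
numbered=$(printf '%s\n' "$sample" |
  awk 'BEGIN { FS = "[\" \t]+" }
       { for (i = 1; i <= NF; i++)
             printf "%d=[%s]%s", i, $i, (i == NF ? "\n" : " ") }')

printf '%s\n' "$numbered"
```

This prints `1=[] 2=[9] 3=[of] 4=[18] 5=[110] 6=[of] 7=[720] 8=[500] 9=[of] 10=[1440] 11=[0]`. Field 1 is empty because the row starts with a delimiter, and the wanted numbers sit at positions 2, 5, 8 and 11.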