How to wrap lines within a column in Linux

Time: 2019-06-26 17:58:27

Tags: linux bash awk

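I have a comma-separated file that I am formatting into 2 columns with printf. I use awk to collect the contents into similar groups so that they can be printed in properly formatted columns.

The formatting works, but the array contents wrap around to the start of the next line instead of wrapping within the column.

Sample input file: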

1,test,test1,test1
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2
2,test,test1,test2

Command used:

awk -F"," 'NR>1 {a[$3]=a[$3] ? a[$3]", "$4" ("$2")" : $4" ("$2")"}
  END {for (i in a) {print i":"a[i]}}' test.dat |
sort |
awk -F":" 'BEGIN { printf "%-15s %-10s\n", "COLUMN1","COLUMN2"; printf "%-15s %-10s\n", "-----------","----------"}
  { printf "%-15s %-10s\n", $1,$2}'

I am also aware of, and have tried, column -t -s"," and pr.

The result looks something like this (mock example):

COLUMN1     COLUMN2
========     =======
1            test1
2            test2, test2, test2, test2, test2, test2,test2, test2, test2,test2, test2, test2, test2, test2

How can I wrap the second column (and the first one too, if it is too long) so that it fits within its frame?

COLUMN1     COLUMN2
========     =======
1            test1
2            test2, test2, test2, test2, test2, test2,test2, test2, 
             test2,test2, test2, test2, test2, test2

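2 answers:

Answer 0 (score: 2)

Given the sample input you posted and the output you say you are getting, let's pretend this is what your original script does: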

$ cat tst.awk
BEGIN { FS=","; OFS="\t" }
{ vals[$1] = ($1 in vals ? vals[$1] ", " : "") $4 }
END {
    print "column1", "column2"
    print "=======", "======="

    for (key in vals) {
        print key, vals[key]
    }
}

$ awk -f tst.awk file
column1 column2
======= =======
1       test1
2       test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2, test2

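That makes a good starting point for your question. So now you want to wrap each column? If so, I would use an existing UNIX tool such as fold or fmt to do the wrapping for you, so that you don't have to write your own code to handle splitting at blanks versus mid-word, etc.: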

$ cat tst.awk
BEGIN { FS=","; OFS="\t" }
{ vals[$1] = ($1 in vals ? vals[$1] ", " : "") $4 }
END {
    print "column1", "column2"
    print "=======", "======="

    for (key in vals) {
        numKeyLines = wrap(key,15,keyArr)
        numValLines = wrap(vals[key],50,valArr)
        numLines = (numKeyLines > numValLines ? numKeyLines : numValLines)
        for (lineNr=1; lineNr<=numLines; lineNr++) {
            print keyArr[lineNr], valArr[lineNr]
        }
    }
}

function wrap(inStr,wid,outArr,         cmd,line,numLines) {
    split("",outArr)    # clear entries left over from the previous call
    if ( length(inStr) > wid ) {
        cmd = "printf \047%s\n\047 \"" inStr "\" | fold -s -w " wid+0
        while ( (cmd | getline line) > 0 ) {
            outArr[++numLines] = line
        }
        close(cmd)
    }
    else {
        outArr[++numLines] = inStr
    }
    return numLines+0
}

$ awk -f tst.awk file
column1 column2
======= =======
1       test1
2       test2, test2, test2, test2, test2, test2, test2,
        test2, test2, test2, test2, test2
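
To see what fold contributes on its own, here is a minimal sketch run outside awk. The sample string and the 18-column width are illustrative assumptions, not values from the question; the only options used are -w (width) and -s (break at blanks).

```shell
# Hypothetical sample string; any long comma-separated list would do.
list='test2, test2, test2, test2, test2'

# Without -s, fold cuts at exactly 18 columns, even in the middle of a word.
hard=$(printf '%s\n' "$list" | fold -w 18)

# With -s, fold breaks after the last blank that still fits in 18 columns.
soft=$(printf '%s\n' "$list" | fold -s -w 18)

printf 'hard wrap:\n%s\n\nsoft wrap:\n%s\n' "$hard" "$soft"
```

The first line of the hard-wrapped result ends mid-word, while every line of the soft-wrapped result ends on a whole item, which is why the script above passes -s.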

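If you have many fields that need wrapping, that won't be fast, because every call to fold spawns a subshell. So here is an all-awk version that splits at whitespace where possible; test it against edge cases and massage to suit: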

$ cat tst.awk
BEGIN { FS=","; OFS="\t" }
{ vals[$1] = ($1 in vals ? vals[$1] ", " : "") $4 }
END {
    print "column1", "column2"
    print "=======", "======="

    for (key in vals) {
        numKeyLines = wrap(key,15,keyArr)
        numValLines = wrap(vals[key],50,valArr)
        numLines = (numKeyLines > numValLines ? numKeyLines : numValLines)
        for (lineNr=1; lineNr<=numLines; lineNr++) {
            print keyArr[lineNr], valArr[lineNr]
        }
    }
}

function wrap(inStr,wid,outArr,         lineEnd,numLines) {
    split("",outArr)    # clear entries left over from the previous call
    while ( length(inStr) > wid ) {
        lineEnd = ( match(substr(inStr,1,wid),/.*[[:space:]]/) ? RLENGTH - 1 : wid )
        outArr[++numLines] = substr(inStr,1,lineEnd)
        inStr = substr(inStr,lineEnd+1)
        sub(/^[[:space:]]+/,"",inStr)
    }
    outArr[++numLines] = inStr
    return numLines
}

$ awk -f tst.awk file
column1 column2
======= =======
1       test1
2       test2, test2, test2, test2, test2, test2, test2,
        test2, test2, test2, test2, test2

Answer 1 (score: 0)

Here is a version that uses perl instead of awk:

#!/usr/bin/env perl
use warnings;
use strict;

my ($col1, $col4, @col4data);

print <<EOF;
COLUMN1     COLUMN2
=======     =======
EOF

{
  my $line = <>;
  chomp $line;
  ($col1, $col4data[0]) = (split /,/, $line)[0,3];
}

while (<>) {
  chomp;
  my ($c, $a) = (split /,/)[0,3];
  if ($c ne $col1) {
    $col4 = join ", ", @col4data;
    write;
    @col4data = ();
    $col1 = $c;
  }
  push @col4data, $a;
}

$col4 = join ", ", @col4data;
write;

format STDOUT =
@<<<<<<<    ^<<<<<<<<<<<<<<<<<<<<<<
$col1,      $col4
~~          ^<<<<<<<<<<<<<<<<<<<<<<
            $col4
.

Example:

$ perl columns.pl input.csv
COLUMN1     COLUMN2
=======     =======
1           test1
2           test2, test2, test2,
            test2, test2, test2,
            test2, test2, test2,
            test2, test2, test2

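The magic here is using the fill mode of the output format to do the line wrapping: the ^<<< field consumes as much of $col4 as fits on each line, and the ~~ marker repeats that line until $col4 is exhausted. Adjust the width as needed by adding more < characters to the obvious part of the format.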