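Combining multiple lines into one with Awk

Time: 2014-09-15 02:15:50

Tags: csv awk gawk

I have a CSV file that uses underscores as the delimiter. I have 8 lines that need to be converted into a single line, like this: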

101_1_variableName_(value)
101_1_variableName1_(value2)

into:

101 1 (value) (value2)

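(preferably with each value in its own cell)

The problem is that I don't know how to combine multiple lines into one in awk. Any help is appreciated.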

Update: (input + output)

101_1_1_trialOutcome_found_berry            
101_1_1_trialStartTime_2014-08-05 11:26:49.510000           
101_1_1_trialOutcomeTime_2014-08-05 11:27:00.318000         
101_1_1_trialResponseTime_0:00:05.804000            
101_1_1_whichResponse_d         
101_1_1_bearPosition_6          
101_1_1_patch_9         
101_1_1_food_11 

(the output below is all on a single line)

101 1 1 found_berry 2014-08-05 11:26:49.510000 2014-08-05 11:27:00.318000 0:00:05.804000 d 6 9 11

1 Answer:

Answer 0: (score: 0)

You can use Perl:

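use strict;
use warnings;

my %hash;

while (<DATA>) {
    # $1: leading numeric IDs (e.g. 101_1_1); the variable name is skipped;
    # $2: the value, with trailing whitespace trimmed
    if (m/^([0-9_]+)_(?:[^_]+)_(.*?)\s*$/) {
        # Group values under the IDs, rejoined with spaces
        push @{ $hash{ join(' ', split('_', $1)) } }, $2;
    }
}

# Sort the keys so the output order is stable when there are several groups
print "$_ " . join(' ', @{ $hash{$_} }) . "\n" for sort keys %hash;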

__DATA__
101_1_1_trialOutcome_found_berry            
101_1_1_trialStartTime_2014-08-05 11:26:49.510000           
101_1_1_trialOutcomeTime_2014-08-05 11:27:00.318000         
101_1_1_trialResponseTime_0:00:05.804000            
101_1_1_whichResponse_d         
101_1_1_bearPosition_6          
101_1_1_patch_9         
101_1_1_food_11 

Prints:

101 1 1 found_berry 2014-08-05 11:26:49.510000 2014-08-05 11:27:00.318000 0:00:05.804000 d 6 9 11

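Alternatively, as a Perl one-liner (`-l` already appends a newline to each `print`, so no explicit "\n" is needed, and `-a` is unnecessary because the fields are never autosplit):

$ perl -lne '
> push @{ $hash{join(" ", split("_", $1) )} }, $2 if (m/^([0-9_]+)_(?:[^_]+)_(.*?)\s*$/);
> END { print "$_ ". join(" ", @{ $hash{$_}}) for (keys %hash); }
> ' file.txt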
101 1 1 found_berry 2014-08-05 11:26:49.510000 2014-08-05 11:27:00.318000 0:00:05.804000 d 6 9 11
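
Since the question asks about awk, the same grouping can also be sketched there. This is only a rough equivalent, not a tested drop-in: it assumes every line starts with exactly three numeric ID fields followed by a variable name, as in the updated sample (the first example in the question shows only two), and the function name `merge_trials` is made up for illustration.

```shell
# merge_trials is a hypothetical helper name, not part of awk.
# Assumption: fields 1-3 are the IDs, field 4 is the variable name,
# and everything after the fourth underscore is the value.
merge_trials() {
    awk -F_ '
    NF >= 5 {
        sub(/[ \t\r]+$/, "")            # trim trailing whitespace; awk re-splits the fields
        key = $1 " " $2 " " $3          # the IDs identify the output line
        val = $5                        # rebuild values that contain "_" (e.g. found_berry)
        for (i = 6; i <= NF; i++) val = val "_" $i
        if (!(key in out)) order[++n] = key   # remember first-seen order of the groups
        out[key] = out[key] " " val
    }
    END { for (i = 1; i <= n; i++) print order[i] out[order[i]] }
    ' "$@"
}

merge_trials <<'EOF'
101_1_1_trialOutcome_found_berry
101_1_1_trialStartTime_2014-08-05 11:26:49.510000
101_1_1_patch_9
EOF
# prints: 101 1 1 found_berry 2014-08-05 11:26:49.510000 9
```

Run it on a file with `merge_trials file.txt`. Because the timestamps contain spaces, a space-separated line cannot be split back into values unambiguously; replacing the `" "` separators in the awk program with `"\t"` gives tab-separated output, which a spreadsheet would load as one value per cell.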