Perl Machine Learning

Time: 2016-07-12 07:43:30

Tags: perl file file-processing

Finding the start, processing, and completion times from an input data file

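Problem description: We need to process the input file line by line. Starting from a given line, we add up the 7th-column values for as long as the running sum stays less than or equal to 30; a batch can span any number of rows as long as the sum does not exceed 30. For each batch, the output row takes the maximum of the 4th column and the maximum of the 3rd column, joins the 2nd-column values with a comma ",", and puts the sum of those two maxima in the fourth output field. The remaining input lines follow the same rule. For example, the first three rows have 7th-column values 9, 10, and 7 (sum 26); adding the fourth row's 6 would give 32, so the first batch is rows 1 to 3 and its output row is "5,36,98 4 2 6".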

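When run, the code below should produce the expected output shown further down. However, it does not produce any output for the last 3 lines of input.txt.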

Please share a fix for this code, or a better solution altogether.

Perl code

  #!/usr/bin/perl

open(A,"output.txt")||die "File unable to open";
$k=0;
while(<A>){

    @col = split(/\s+/,$_);
        $k+="$col[6]";
    if($k <= 30)
    {
        push(@val,$_);
    }

     if($k > 30)
     {
        func(@val);
        $k = 0;
        @val=();
        @cl=();
        @BT=();
        @start=();
        @process=();
        $k+="$col[6]";
        push(@val,$_);
     }
}

sub func()
{
    foreach $line (@val){

            @cl = split(/\s+/,$line);
            push(@BT,$cl[1]);
            push(@start,$cl[3]);
            push(@process,$cl[2]);  

            $star=(sort{$a<=>$b}@start)[-1];
            $proc=(sort{$a<=>$b}@process)[-1];
            $Bt=join(",",@BT);
            $complete = $star+$proc;        

        }
            print "Patch1(Job)Time\tStartingTime\tProcessing\tTimeCompletion\n$Bt\t$star\t$proc\t$complete\n";
}

input.txt

4.40609     5   1   4   7   14  9 
3.14721     36  1   1   10  10  10
1.93361     98  2   1   4   5   7
1.36379     30  3   1   12  13  6
1.31525     83  2   1   7   5   3
0.52453     61  3   5   8   5   9
0.30074     84  20  1   26  13  4
0.22983     17  4   10  15  12  9
0.11886     9   10  5   24  12  8
0.09495     67  30  2   36  7   2
0.08804     24  16  5   28  11  14
0.07005     35  26  6   35  11  9
0.06632     38  11  9   26  14  12
0.06436     50  21  3   36  12  4
0.06268     14  11  5   27  9   7
0.06131     11  19  2   33  8   2
0.05119     10  22  7   33  10  2
0.04004     45  1   4   37  6   6
0.03790     88  11  1   34  8   7
0.03244     86  27  4   45  13  6
0.03171     91  17  5   34  8   9
0.03135     85  9   1   34  7   5
0.03127     25  10  1   33  6   16
0.03055     72  9   1   33  6   9
0.03030     34  8   8   32  13  9
0.02963     75  18  1   40  9   1
0.02919     51  17  10  34  14  3
0.02904     1   24  9   35  8   5
0.02537     21  4   8   31  8   22
0.02205     100 26  2   48  11  8
0.02187     41  9   3   39  12  9
0.01731     3   10  5   39  12  7
0.01698     44  2   14  34  11  8
0.01613     93  17  6   40  10  10
0.01388     20  13  5   41  11  21
0.01243     70  9   12  29  6   7
0.01128     76  9   11  33  8   21
0.01116     60  5   11  30  5   8
0.01113     47  26  7   49  12  5
0.01038     32  9   15  34  14  7
0.01018     46  27  2   50  6   5
0.01000     27  1   9   44  7   3
0.00941     42  11  10  39  12  7
0.00940     6   30  1   55  7   11
0.00802     23  26  6   44  4   15
0.00801     28  5   10  35  6   4
0.00630     55  4   9   44  12  8
0.00556     78  26  1   55  6   23
0.00533     31  21  15  38  6   7
0.00516     13  9   10  41  9   6
0.00515     97  3   4   45  5   9
0.00500     29  4   14  42  14  5
0.00491     54  12  6   43  6   2
0.00478     33  5   6   47  10  8
0.00422     37  14  15  42  13  10
0.00413     79  19  3   53  8   15
0.00408     52  28  2   61  9   3
0.00372     53  12  7   41  4   2
0.00369     16  2   12  40  4   9
0.00361     87  13  7   47  8   7
0.00349     68  1   12  48  6   1
0.00293     64  26  4   63  13  6
0.00273     96  22  10  52  9   10
0.00244     74  19  1   60  9   12
0.00216     40  28  1   65  7   10
0.00187     66  17  13  47  7   2
0.00181     43  22  15  51  10  8
0.00173     82  25  4   67  14  17
0.00154     49  12  9   53  10  4
0.00151     12  14  8   57  13  1
0.00147     62  14  13  43  4   11
0.00143     18  26  2   70  12  12
0.00132     7   24  15  54  9   25
0.00130     69  27  14  54  6   6
0.00126     90  27  1   72  11  9
0.00117     94  24  11  58  8   12
0.00105     99  10  14  45  5   8
0.00095     2   13  4   55  4   6
0.00075     56  7   10  54  7   8
0.00067     48  27  4   66  4   5
0.00056     8   23  13  60  7   6
0.00053     89  17  10  64  12  8
0.00051     92  16  8   67  14  10
0.00034     65  28  6   78  11  8
0.00026     71  12  12  62  8   8
0.00024     22  24  15  64  6   23
0.00022     15  22  12  65  5   8
0.00022     26  16  15  66  13  17
0.00022     77  26  6   72  4   22
0.00020     81  5   14  62  8   16
0.00018     57  17  13  61  4   5
0.00017     80  23  11  74  10  14
0.00016     73  27  13  76  11  9
0.00015     59  16  9   69  6   7
0.00014     19  12  12  66  7   1
0.00014     58  29  10  81  10  6
0.00014     63  26  15  72  8   25
0.00008     95  25  8   77  4   11
0.00007     39  22  13  74  6   11
0.00006     4   22  14  76  7   8

Expected output

Batch1(Job)Time StartingTime    Processing  TimeCompletion
5,36,98 4   2   6
Batch1(Job)Time StartingTime    Processing  TimeCompletion
30,83,61,84 5   20  25
Batch1(Job)Time StartingTime    Processing  TimeCompletion
17,9,67 10  30  40
Batch1(Job)Time StartingTime    Processing  TimeCompletion
24,35   6   26  32
Batch1(Job)Time StartingTime    Processing  TimeCompletion
38,50,14,11,10  9   22  31
Batch1(Job)Time StartingTime    Processing  TimeCompletion
45,88,86,91 5   27  32
Batch1(Job)Time StartingTime    Processing  TimeCompletion
85,25,72    1   10  11
Batch1(Job)Time StartingTime    Processing  TimeCompletion
34,75,51,1  10  24  34
Batch1(Job)Time StartingTime    Processing  TimeCompletion
21,100  8   26  34
Batch1(Job)Time StartingTime    Processing  TimeCompletion
41,3,44 14  10  24
Batch1(Job)Time StartingTime    Processing  TimeCompletion
93  6   17  23
Batch1(Job)Time StartingTime    Processing  TimeCompletion
20,70   12  13  25
Batch1(Job)Time StartingTime    Processing  TimeCompletion
76,60   11  9   20
Batch1(Job)Time StartingTime    Processing  TimeCompletion
47,32,46,27,42  15  27  42
Batch1(Job)Time StartingTime    Processing  TimeCompletion
6,23,28 10  30  40
Batch1(Job)Time StartingTime    Processing  TimeCompletion
55  9   4   13
Batch1(Job)Time StartingTime    Processing  TimeCompletion
78,31   15  26  41
Batch1(Job)Time StartingTime    Processing  TimeCompletion
13,97,29,54,33  14  12  26
Batch1(Job)Time StartingTime    Processing  TimeCompletion
37,79,52,53 15  28  43
Batch1(Job)Time StartingTime    Processing  TimeCompletion
16,87,68,64 12  26  38
Batch1(Job)Time StartingTime    Processing  TimeCompletion
96,74   10  22  32
Batch1(Job)Time StartingTime    Processing  TimeCompletion
40,66,43    15  28  43
Batch1(Job)Time StartingTime    Processing  TimeCompletion
82,49,12    9   25  34
Batch1(Job)Time StartingTime    Processing  TimeCompletion
62,18   13  26  39
Batch1(Job)Time StartingTime    Processing  TimeCompletion
7   15  24  39
Batch1(Job)Time StartingTime    Processing  TimeCompletion
69,90,94    14  27  41
Batch1(Job)Time StartingTime    Processing  TimeCompletion
99,2,56,48  14  27  41
Batch1(Job)Time StartingTime    Processing  TimeCompletion
8,89,92 13  23  36
Batch1(Job)Time StartingTime    Processing  TimeCompletion
65,71   12  28  40
Batch1(Job)Time StartingTime    Processing  TimeCompletion
22  15  24  39
Batch1(Job)Time StartingTime    Processing  TimeCompletion
15,26   15  22  37
Batch1(Job)Time StartingTime    Processing  TimeCompletion
77  6   26  32
Batch1(Job)Time StartingTime    Processing  TimeCompletion
81,57   14  17  31
Batch1(Job)Time StartingTime    Processing  TimeCompletion
80,73,59    13  27  40
Batch1(Job)Time StartingTime    Processing  TimeCompletion
19,58   12  29  41
Batch1(Job)Time StartingTime    Processing  TimeCompletion
63  15  26  41
Batch1(Job)Time StartingTime    Processing  TimeCompletion
95,39,4 14  25  39

1 answer:

Answer 0 (score: 3)

Some suggestions for improvement.

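  1. Add use strict and use warnings. Then fix the problems they reveal (mostly, I suspect, undeclared variables).
  2. Use the 3-arg form of open() and a lexical filehandle (open my $fh, '<', $file).
  3. Include $! in the open()/die() message (open ... or die "$file: $!").
  4. split(/\s+/, $_) is nearly the same as a bare split, which splits $_ on whitespace and also skips leading whitespace.
  5. The quotes in $k+="$col[6]" are not needed (write $k += $col[6]).
  6. func() is a terrible name for a subroutine.
  7. func() uses a lot of global variables.
  8. You pass @val to func(), but inside the subroutine you use the global copy rather than the argument.
  9. I haven't had time to look at the algorithm you're using.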
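
As a rough sketch of how these suggestions might fit together (this is not the asker's code, and it has not been run against the full input.txt), the script below uses strict and warnings, a lexical filehandle with $! in the error message, a bare-style split ' ', and subroutines that work only on their arguments. The subroutine names summarize_batch and batch_lines are made up for illustration, and the header text is copied from the expected output in the question. It also flushes the final batch after the loop. In the question's code func() is only called when the running sum exceeds 30, so whatever is still in @val at end of file is never printed, which would explain the missing last three rows.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);    # core module

# Hypothetical helper (name invented for this sketch): summarise one batch.
# Each row is an array ref holding the whitespace-separated fields of a line.
sub summarize_batch {
    my @rows = @_;
    my $jobs  = join ',', map { $_->[1] } @rows;    # 2nd column, comma-joined
    my $proc  = max map { $_->[2] } @rows;          # max of 3rd column
    my $start = max map { $_->[3] } @rows;          # max of 4th column
    return join "\t", $jobs, $start, $proc, $start + $proc;
}

# Hypothetical helper: group lines so that the sum of the 7th column in each
# batch stays <= $limit, and return one summary string per batch.
sub batch_lines {
    my ($limit, @lines) = @_;
    my (@out, @batch);
    my $sum = 0;
    for my $line (@lines) {
        my @col = split ' ', $line;
        next unless @col >= 7;                      # skip blank or short lines
        # Close the current batch if this row would push the sum over the limit.
        if (@batch && $sum + $col[6] > $limit) {
            push @out, summarize_batch(@batch);
            @batch = ();
            $sum   = 0;
        }
        push @batch, \@col;
        $sum += $col[6];
    }
    # Flush the final batch; without this step the last rows are lost.
    push @out, summarize_batch(@batch) if @batch;
    return @out;
}

# Run as a script only when a file name is given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<', $file or die "$file: $!";
    my @lines = <$fh>;
    close $fh;
    for my $summary (batch_lines(30, @lines)) {
        print "Batch1(Job)Time\tStartingTime\tProcessing\tTimeCompletion\n";
        print "$summary\n";
    }
}
```

Saved under a name of your choosing and run with input.txt as its only argument, it prints one header line and one summary line per batch. Keeping the grouping in batch_lines and the per-batch arithmetic in summarize_batch means neither needs global variables, and the limit of 30 becomes a parameter rather than a constant buried in the loop.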