Question

My CGI page takes a long time to read and process text from a text file. I have stored thousands of records in the text file in the following format.

|!| Row 1 |!| Row 2 |!| Row 3 |!| Row 4 |!| Row 5 |!| Row 6 |!| Row 7
|!| Row 1 |!| Row 2 |!| Row 3 |!| Row 4 |!| Row 5 |!| Row 6 |!| Row 7
|!| Row 1 |!| Row 2 |!| Row 3 |!| Row 4 |!| Row 5 |!| Row 6 |!| Row 7

I display the text data above on a CGI page by splitting each line on the " |!| " delimiter. The code I am using is below.

use strict;
use CGI;
use File::Slurp;

my $htmls = CGI->new();

my ($recordfile, @content, $tablefields);
$recordfile   = 'Call.txt';
@content      = read_file($recordfile);
$tablefields  = validate_records(\@content);

sub validate_records {
    my @all_con = @{(shift)};
    my $tab_str;
    my $cnts    = 0;
    foreach my $rec_ln (@all_con) {
        $cnts++;
        chomp($rec_ln);
        push my @splitted, split(/ \|\!\| /, $rec_ln);

        my $radioStr = "<input type=\"radio\" name=\"cell\" value=\"$rec_ln\"\/>";

        $tab_str.="<tr>
       <td style=\"text-align\:center\;\">$radioStr</td>                               
       <td>$splitted[1]</td>
       <td>$splitted[2]</td>
       <td>$splitted[3]</td>
       <td>$splitted[4]</td>
       <td>$splitted[5]</td>
       <td>$splitted[6]</td>
       <td>$splitted[7]</td>
   </tr>";

        $tab_str=~s/<td><\/td>/<td>N\/A<\/td>/igs;
   }
   return $tab_str;    
}

print
$htmls->header(),
'<html>
   <head></head>
   <body>
            <table border="1" align="center" width="100%" id="table" style="margin-top:35px;border:0px;" class="TabClass"><thead>
              <tr>
                <th>SELECT</th>                 
                <th>HEADER 2</th>
                <th>HEADER 3</th>
                <th>HEADER 4</th>
                <th>HEADER 5</th>
            <th>HEADER 6</th>
                <th>HEADER 7</th>
              </tr>
           </thead>'.
              $tablefields.
            '</table>

   </body>
   </html>';

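Once the file contains a large number of records, the code above takes more than two minutes to display all the data on my page. Is there a way to read and process the file records faster?

Please share your suggestions.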

Answer 1

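For a start, move the line

$tab_str =~ s/<td><\/td>/<td>N\/A<\/td>/igs;

out of the foreach loop, so that it runs once over the finished string instead of once per record.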
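As a rough illustration of that change, here is a minimal sketch that builds all rows first and runs the substitution a single time. The `build_rows` helper and the sample field lists are hypothetical stand-ins for the question's `validate_records` and its split records.

```perl
use strict;
use warnings;

# Build every row first, then substitute once over the finished string.
# Each argument is an array reference holding the fields of one record.
sub build_rows {
    my @rows    = @_;
    my $tab_str = '';
    foreach my $fields (@rows) {
        $tab_str .= '<tr>' . join('', map { "<td>$_</td>" } @$fields) . "</tr>\n";
    }
    # One pass over the complete string instead of one pass per record.
    $tab_str =~ s{<td></td>}{<td>N/A</td>}g;
    return $tab_str;
}

# Hypothetical sample data with two empty fields.
my $html = build_rows([ 'a', '', 'c' ], [ '', 'e', 'f' ]);
print $html;
```

With the substitution inside the loop, every record triggers a scan of the whole string built so far, so the work grows roughly quadratically with the number of records. Running it once keeps it linear.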

Answer 2

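Why does your program take so long to run? Let's look at what it does:

First, you slurp the contents of the file into @content. Then you copy the values into @all_con inside the subroutine. In quick succession you have now used twice the file's size in memory, and none of it is released until the program ends.

Next, you loop over the lines, split each one, and do some concatenation, ending up with a string that is more than twice as long as the original line. You then append all of these strings together, and for every new addition you run a substitution over the whole growing string to check for empty cells. At this point you hold about 4 times the original file size in memory and are running regex substitutions across it.

What you should do instead:

Drop the |!| delimiter and use a proper serialization module such as Text::CSV. Pass the file name to the subroutine and parse the file with a while loop: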

use Text::CSV;

my $csv = Text::CSV->new({ binary => 1 });   # default separator is a comma
open my $fh, "<", $file or die "Cannot open $file: $!";
while (my $row = $csv->getline($fh)) {       # $row is an array ref of fields
    print .... ;                             # print directly
}
close $fh;

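The Text::CSV module is efficient, and the CSV format is reliable. Because you iterate over the file handle and print directly, you do not hold data in memory unnecessarily.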
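The same read-one-line, print-one-row pattern can be sketched with core Perl only, for the case where the existing " |!| " format has to stay. This is a swapped-in illustration rather than the Text::CSV approach itself. The `print_rows` helper and the in-memory sample are assumptions; a real script would open 'Call.txt' and print to STDOUT.

```perl
use strict;
use warnings;

# Stream records from a file handle and emit each table row immediately,
# so only one line is held in memory at a time.
sub print_rows {
    my ($in, $out) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        chomp $line;
        $line =~ s/^\|!\| //;                      # drop the leading delimiter
        my @fields = split / \|!\| /, $line, -1;   # -1 keeps trailing empty fields
        print {$out} '<tr>', (map { "<td>$_</td>" } @fields), "</tr>\n";
        $count++;
    }
    return $count;
}

# Demo on in-memory handles (hypothetical sample data).
my $sample = "|!| Row 1 |!| Row 2 |!| Row 3\n|!| Row 4 |!| Row 5 |!| Row 6\n";
open my $in, '<', \$sample or die "Cannot open sample: $!";
my $buffer = '';
open my $out, '>', \$buffer or die "Cannot open buffer: $!";
my $rows = print_rows($in, $out);
close $out;
print $buffer;
```

Neither the file contents nor the generated HTML is accumulated in a variable in the real use case, so memory use stays flat regardless of how many records the file holds.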

Also, instead of using a substitution to check for empty fields, you can do the check directly while building the string:

print start_table(), "<tr>";
for (@$row) {
    my $val = $_;
    if ($val !~ /\S/) {   # contains no non-whitespace
        $val = "N/A";
    }
    print "\t", td($val);
}
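For readers who are not using the CGI HTML helpers, the per-field check can be sketched with plain string building. The `cell` and `row_html` helpers below are hypothetical names introduced for illustration, not part of any module.

```perl
use strict;
use warnings;

# Wrap one field in a table cell, replacing a missing or blank value
# with "N/A" while the row is being built.
sub cell {
    my ($val) = @_;
    $val = 'N/A' if !defined $val || $val !~ /\S/;   # no non-whitespace
    return "<td>$val</td>";
}

# Build one complete table row from a list of fields.
sub row_html {
    my @fields = @_;
    return '<tr>' . join('', map { cell($_) } @fields) . "</tr>\n";
}

# Hypothetical sample record with one empty and one whitespace-only field.
my $row = row_html('Row 1', '', '   ', 'Row 4');
print $row;
```

Checking each field as it is emitted means no regex ever has to scan HTML that has already been generated.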

How can I read a text file quickly using Perl?
