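Adding eight thousand rows in a single commit

Time: 2018-12-10 08:16:15

Tags: linux windows perl commit

I have a log file. A Perl script reads the file and transcribes it, and I would like to send all of these rows (eight thousand of them) in a single commit.

My script: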

# Connect to the database.
my $dbh = DBI->connect(
    "DBI:mysql:database=DB;host=>IP",
    "hostname", 'password',
    {'RaiseError' => 1,'AutoCommit'=> 0}
);

    open (FILE, 'file.log');
    while (<FILE>) {

        ($word1, $word2, $word3, $word4, $word5, $word6, $word7, $word8, $word9, $word10, $word11, $word12, $word13, $word14) = split(" ");

        $word13 =~ s/[^\d.]//g;
        if ($word2 eq "Feb") {
                $word2 = "02"  
        }

        print "'$word5-$word2-$word3 $word4', $word11, $word13 \n";

        eval {
            # we can use insert, but there will be duplicates, and here we are in a unique table
            my $sth = $dbh->prepare("INSERT INTO `test_query` (time, cstep, time_in_seconde) VALUES('$word5-$word2-$word3 $word4', $word11, $word13);");

            #print $sth->rows . " rows found.\n";
            #$sth->finish;          

            # do inserts, updates, deletes, queries here
            #$sth->execute() or die "execution failed: $dbh->errstr()";
            $sth->execute() or die "execution failed: $dbh->errstr()";
            $dbh->commit();

        };

        ### If something went wrong...

    }
}

$dbh->disconnect();

Thanks

1 answer:

Answer 0 (score: 3)

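For better performance, you want to simplify your code and move as much of it as possible out of the loop:

  • prepare the statement outside the loop, using bind parameters: the statement is always the same, only the bind parameters change

  • commit outside the loop: this improves performance and has the added advantage of making the process atomic. Since all changes happen within the same database transaction, either all rows are processed (and committed) or, if any row fails, none are committed at all. When implementing this optimization, keep an eye on resource usage on the database side (this typically needs more space in the UNDO tablespace); if resources are insufficient, either increase them or commit every N records (with N as high as possible)

  • avoid print inside the loop unless you really need it (I commented that line out)

  • you are creating the connection with the RaiseError attribute enabled, but you then ignore the errors that can occur at execute. If that is really what you want, simply disable the RaiseError attribute on the statement handle and remove the eval around execute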

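Other considerations regarding coding practices:

  • always use strict and use warnings

  • use an array to store the parsed data instead of a list of scalars: it can make your code faster, and it makes it more readable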

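The array-based parsing suggested in the last bullet could be isolated in a small helper. This is only a sketch: parse_log_line and %MONTH_NUMBER are hypothetical names, the field positions are taken from the script in the question, and the sample line is invented because the real log format is not shown. It also maps every month abbreviation, whereas the original script only handled "Feb".

```perl
use strict;
use warnings;

# Month abbreviations mapped to two-digit numbers (the original only handled "Feb").
my %MONTH_NUMBER = (
    Jan => '01', Feb => '02', Mar => '03', Apr => '04',
    May => '05', Jun => '06', Jul => '07', Aug => '08',
    Sep => '09', Oct => '10', Nov => '11', Dec => '12',
);

# Hypothetical helper: turn one log line into the three values bound to the INSERT.
# Returns an empty list when the line has fewer than 13 fields.
sub parse_log_line {
    my ($line) = @_;

    # split ' ' (a string, not a regex) skips leading whitespace and
    # treats any run of whitespace as a single separator.
    my @words = split ' ', $line;
    return if @words < 13;

    # Keep only digits and dots in the 13th field, e.g. "12.5s" becomes "12.5".
    $words[12] =~ s/[^\d.]//g;

    # Replace the month name with its number; unknown values are left untouched.
    my $month = $MONTH_NUMBER{ $words[1] } // $words[1];

    return ( "$words[4]-$month-$words[2] $words[3]", $words[10], $words[12] );
}

# Sample line invented for illustration; the real log format is an assumption.
my @values = parse_log_line(
    "Mon Feb 11 08:16:15 2018 host proc step a b 42 took 12.5s"
);
print join( ' | ', @values ), "\n";
```

For the invented sample line this prints "2018-02-11 08:16:15 | 42 | 12.5". The returned list can be passed straight to $sth->execute, and an empty list is a cheap way to skip malformed lines.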

Code:

use strict;
use warnings;
use DBI;

# Connect to the database.
my $dbh = DBI->connect(
    "DBI:mysql:database=DB;host=IP",
    "hostname", 'password',
    {'RaiseError' => 1,'AutoCommit'=> 0}
);

# prepare the insert statement once, with placeholders for the bind parameters
my $sth = $dbh->prepare("INSERT INTO `test_query` (time, cstep, time_in_seconde) VALUES(?, ?, ?)");
# errors at execute are deliberately ignored, as in the original script
$sth->{RaiseError} = 0;

open (my $file, '<', 'file.log') or die "could not open : $!";
while (<$file>) {
    # split ' ' splits on runs of whitespace, like split(" ") in the original script
    my @words = split ' ';
    $words[12] =~ s/[^\d.]//g;
    if ($words[1] eq "Feb") {
        $words[1] = "02";
    }

    # print "'$words[4]-$words[1]-$words[2] $words[3]', $words[10], $words[12] \n";
    $sth->execute( "$words[4]-$words[1]-$words[2] $words[3]", $words[10], $words[12] );
}
close $file;

# a single commit for all rows
$dbh->commit;
$dbh->disconnect;

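If a single transaction for all rows turns out to be too heavy for the database, the "commit every N records" option from the list above could be sketched as follows. insert_in_batches is a hypothetical helper, not part of DBI: it only assumes a database handle with a commit method, a prepared statement handle with an execute method, and a code reference that returns the next row as an array reference (or undef when there are no more rows).

```perl
use strict;
use warnings;

# Hypothetical helper: run $sth->execute for every row produced by $next_row,
# committing on $dbh after each full batch and once more for the remainder.
# $next_row is a code reference returning an array reference, or undef at the end.
sub insert_in_batches {
    my ( $dbh, $sth, $next_row, $batch_size ) = @_;

    my $pending = 0;    # rows executed since the last commit
    my $total   = 0;    # rows executed overall

    while ( defined( my $row = $next_row->() ) ) {
        $sth->execute(@$row);
        $total++;
        if ( ++$pending >= $batch_size ) {
            $dbh->commit;
            $pending = 0;
        }
    }

    # Commit whatever is left over from the last, partial batch.
    $dbh->commit if $pending;

    return $total;
}
```

With the handles from the code above, a call could look like insert_in_batches($dbh, $sth, $reader, 1000), where $reader reads one line of the file and returns the bind values; the batch size of 1000 is an arbitrary starting point. Note that the load is then no longer atomic: rows committed in earlier batches stay in the table if a later batch fails.
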
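A last solution, which might run even faster than the loop above, is to use the DBI method execute_array to perform bulk database inserts. The ArrayTupleFetch attribute can be used to supply a code reference that DBI calls each time it is ready to execute the next INSERT: this code reference should read the next line of the file and return an array reference holding the values for that INSERT. When the file is exhausted, the sub should return undef, which tells DBI that the bulk processing is complete.

Code: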

#!/usr/local/bin/perl

use strict;
use warnings;
use DBI;

# open the file
open (my $file, '<', 'file.log') or die "could not open : $!";

# connect the database
my $dbh = DBI->connect("DBI:mysql:database=DB;host=ip", "hostname", 'password', {'RaiseError' => 1,'AutoCommit'=> 0});

# prepare the INSERT statement
my $sth = $dbh->prepare("INSERT INTO `test_query` (time, cstep, time_in_seconde) VALUES(?, ?, ?)");

# run bulk INSERTS
my $tuples = $sth->execute_array({
    ArrayTupleStatus => \my @tuple_status,
    ArrayTupleFetch => sub {
        my $line = <$file>;
        # returning undef at end of file tells DBI that there are no more tuples
        return unless defined $line;
        my @words = split ' ', $line;
        # ... do anything you like with the array, then ...
        return [ "$words[4]-$words[1]-$words[2] $words[3]", $words[10], $words[12] ];
    }
});

if ($tuples) {
    print "Successfully inserted $tuples records\n";
} else {
    # do something useful with @tuple_status, which contains the detailed results
}

$dbh->commit;
$dbh->disconnect;
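
As one way to "do something useful" with @tuple_status, the failed tuples can be listed. The sketch below relies on the documented DBI behaviour that a failed tuple is recorded as an array reference holding the error code, error string and state; failed_tuples is a hypothetical helper, and the status array is invented for illustration. Since the connection above enables RaiseError, a failure may surface as an exception rather than a false return value, so wrapping execute_array in eval may be needed; treat that as an assumption to check against your DBI version.

```perl
use strict;
use warnings;

# Hypothetical helper: list the tuples that failed in an execute_array call.
# For a failed tuple, DBI stores an array reference [ err, errstr, state ] in
# the status array; successful tuples hold the execute return value instead.
sub failed_tuples {
    my ($tuple_status) = @_;

    my @failures;
    for my $index ( 0 .. $#$tuple_status ) {
        my $status = $tuple_status->[$index];
        next unless ref($status) eq 'ARRAY';
        my ( $err, $errstr ) = @$status;
        push @failures, "tuple $index failed: $errstr (error $err)";
    }
    return @failures;
}

# Status array invented for illustration: the second tuple failed.
my @tuple_status = ( 1, [ 1062, "Duplicate entry", "23000" ], 1 );
print "$_\n" for failed_tuples( \@tuple_status );
```

The tuple index is zero-based and follows the order in which the rows were fetched, so it can be mapped back to a line of the log file.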