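Perl: Converting XML to MySQL using LibXML

Time: 2016-05-13 09:31:52

Tags: mysql xml perl

I'm new to Perl and am writing a Perl script for practice. I want to parse information from an XML file into a MySQL database, but I'm stuck: I can't find a way to get the data into MySQL.

Here is my Perl code: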

#!/usr/local/bin/perl
use strict;
use warnings;
use diagnostics;
use XML::LibXML;
use DBI;
my $filename = 'test.xml';
my $dom = XML::LibXML->load_xml(location => $filename);
my $sport_id;
my $sport_name;
my $competition_id;
my $competition_name;
my $game_id;
my $game_start;
my $game_name;
my @values;
my $dbh = DBI->connect("dbi:mysql:parser:127.0.0.1", "root", "123qwe", { RaiseError => 1}) or die $DBI::errstr;
my $query = 'INSERT INTO sports (sport_id,sport_name,competition_id,competition_name,game_id,game_start,game_name) VALUES (?,?,?,?,?,?,?)';
my $sth = $dbh->prepare($query) or die "Prepare failed: " . $dbh->errstr();

foreach my $test ($dom->findnodes('//Sport')) {
    print "\n";
    $sport_id = $test->findvalue('./ID');
    $sport_name = $test->findvalue('./Name');
    $competition_id = $test->findvalue('./Competitions/Competition/ID');
    $competition_name = $test->findvalue('./Competitions/Competition/Name');
    $game_id = $test->findvalue('./Competitions/Competition/Games/ID');
    $game_start = $test->findvalue('./Competitions/Competition/Games/Start');
    $game_name = $test->findvalue('./Competitions/Competition/Games/Name');
    #print "Sport ID: $sport_id\n";
    #print "Sport Name: $sport_name\n";
    #print "Competition ID: $competition_id\n";
    #print "Competition Name: $competition_name\n";
    #print "Game ID: $game_id\n";
    #print "Game Start: $game_start\n";
    #print "Game Name: $game_name\n";
    #print "\n";
    push @values, $sport_id,$sport_name,$competition_id,$competition_name,$game_id,$game_start,$game_name;
    $sth->execute(@values) or die $dbh->errstr;
}

My XML:

<Sports>
<Sport>
<ID>1369527874</ID>
<Name>Virtual Football</Name>
<Competitions>
<Competition>
<ID>1374380502</ID>
<Name>Virtual Football. World - G.Devs Stadium</Name>
<Games>
<ID>1974885309</ID>
<Start>2016-05-11 12:21:00</Start>
<Name>New England Militia - St. Louis Racers</Name>
<ID>1974892839</ID>
<Start>2016-05-11 12:27:00</Start>
<Name>Las Vegas Rollers - Salt Lake Wrath</Name>
</Games>
</Competition>
</Competitions>
</Sport>
<Sport>
<ID>882</ID>
<Name>Darts</Name>
<Competitions>
<Competition>
<ID>1834852369</ID>
<Name>Darts. World - PDC European Tour Outright</Name>
<Games>
<ID>1895020486</ID>
<Start>2016-05-15 23:00:00</Start>
<Name>PDC European Tour. Outright</Name>
</Games>
</Competition>
</Competitions>
</Sport>
</Sports>

MySQL table structure:

+------------------+--------------+------+-----+---------+----------------+
| Field            | Type         | Null | Key | Default | Extra          |
+------------------+--------------+------+-----+---------+----------------+
| id               | int(6)       | NO   | PRI | NULL    | auto_increment |
| sport_id         | varchar(255) | YES  |     | NULL    |                |
| sport_name       | varchar(255) | YES  |     | NULL    |                |
| competition_id   | varchar(255) | YES  |     | NULL    |                |
| competition_name | varchar(255) | YES  |     | NULL    |                |
| game_id          | varchar(255) | YES  |     | NULL    |                |
| game_start       | varchar(255) | YES  |     | NULL    |                |
| game_name        | varchar(255) | YES  |     | NULL    |                |
+------------------+--------------+------+-----+---------+----------------+

If I uncomment the print lines, the output looks like this:

Sport ID: 1369527874
Sport Name: Virtual Football
Competition ID: 1374380502
Competition Name: Virtual Football. World - G.Devs Stadium
Game ID: 19748853091974892839
Game Start: 2016-05-11 12:21:002016-05-11 12:27:00
Game Name: New England Militia - St. Louis RacersLas Vegas Rollers - Salt Lake Wrath


Sport ID: 882
Sport Name: Darts
Competition ID: 1834852369
Competition Name: Darts. World - PDC European Tour Outright
Game ID: 1895020486
Game Start: 2016-05-15 23:00:00
Game Name: PDC European Tour. Outright

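As you can see, the main problem is that some sports have multiple games, and I can't find a way to split them apart so that I can import them into MySQL.

1 Answer:

Answer 0 (score: 2)

I would restructure what you are doing. It looks like your table should have one row per game rather than one row per sport.

So you need an 'inner loop' to pick out the game IDs. Unfortunately the games are not grouped: each game's ID, Start and Name sit directly under Games with no wrapper element, so you need to do a 'next sibling' operation to get from an ID to its Start and Name.

Using XML::Twig, because I'm more familiar with it, something like this: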

#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Assumes the XML from the question is placed after a __DATA__ marker
# at the end of this script.
my $twig = XML::Twig->parse( \*DATA );

foreach my $sport ( $twig->findnodes('//Sport') ) {
    my %fields;
    $fields{sport_id}         = $sport->findvalue('./ID');
    $fields{sport_name}       = $sport->findvalue('./Name');
    $fields{competition_id}   = $sport->findvalue('.//Competition/ID');
    $fields{competition_name} = $sport->findvalue('.//Competition/Name');

    # One iteration per game: each ID is followed by its Start, then its Name.
    foreach my $game ( $sport->findnodes('.//Games/ID') ) {
        $fields{game_id}    = $game->text;
        $fields{game_start} = $game->next_sibling->text;
        $fields{game_name}  = $game->next_sibling->next_sibling->text;
        print "Fields: ", join(
            ",",
            @fields{
                qw(sport_id sport_name
                    competition_id competition_name
                    game_id game_start game_name)
            }
            ),
            "\n";
    }
}

(Pretty sure you can do the same thing in XML::LibXML.)
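
For reference, here is a minimal sketch of how the same inner loop might look in XML::LibXML. It is not the answerer's code. It assumes the XML has the same shape as in the question (inlined here as a string so the script is self-contained) and uses the XPath `following-sibling` axis instead of `next_sibling`, because XML::LibXML keeps the whitespace text nodes between elements. The `@rows` array and the extra loop over `Competition` elements are illustrative choices, not part of the original.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

# Sample data copied from the question.
my $xml = <<'XML';
<Sports>
<Sport>
<ID>1369527874</ID>
<Name>Virtual Football</Name>
<Competitions>
<Competition>
<ID>1374380502</ID>
<Name>Virtual Football. World - G.Devs Stadium</Name>
<Games>
<ID>1974885309</ID>
<Start>2016-05-11 12:21:00</Start>
<Name>New England Militia - St. Louis Racers</Name>
<ID>1974892839</ID>
<Start>2016-05-11 12:27:00</Start>
<Name>Las Vegas Rollers - Salt Lake Wrath</Name>
</Games>
</Competition>
</Competitions>
</Sport>
<Sport>
<ID>882</ID>
<Name>Darts</Name>
<Competitions>
<Competition>
<ID>1834852369</ID>
<Name>Darts. World - PDC European Tour Outright</Name>
<Games>
<ID>1895020486</ID>
<Start>2016-05-15 23:00:00</Start>
<Name>PDC European Tour. Outright</Name>
</Games>
</Competition>
</Competitions>
</Sport>
</Sports>
XML

my $dom = XML::LibXML->load_xml( string => $xml );

# Collect one array ref per game, in the column order of the INSERT statement.
my @rows;
foreach my $sport ( $dom->findnodes('//Sport') ) {
    foreach my $competition ( $sport->findnodes('./Competitions/Competition') ) {
        foreach my $game_id ( $competition->findnodes('./Games/ID') ) {
            push @rows, [
                $sport->findvalue('./ID'),
                $sport->findvalue('./Name'),
                $competition->findvalue('./ID'),
                $competition->findvalue('./Name'),
                $game_id->textContent,
                # First Start / Name element that follows this particular ID.
                $game_id->findvalue('following-sibling::Start[1]'),
                $game_id->findvalue('following-sibling::Name[1]'),
            ];
        }
    }
}

print join( ',', @$_ ), "\n" for @rows;
```

With the sample data this yields three rows, one per game. Each row could then be passed to the statement handle prepared in the question with `$sth->execute(@$row)`. Building a fresh list per game also avoids the question's `push @values, ...` on a shared array, which keeps growing across iterations and so no longer matches the seven placeholders from the second `execute` call onward.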