Matching MySQL tables using Perl

Date: 2009-07-29 08:47:19

Tags: sql perl

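I have 2 MySQL tables, main_table and query1. main_table contains the columns position and chr, while query1 contains position, chr and symbol. The table query1 was derived by querying main_table. I want to match these two tables using Perl so that the first column of the output holds the whole list of positions from main_table and the second column holds the symbols corresponding to each position. A position may have no symbol at all, exactly one symbol, or several symbols.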

I am not quite sure how to write the code for this. At the moment I have:

#!/usr/bin/perl

use strict;
use DBI;

my %ucsc;

my $dbh  = DBI->connect('DBI:mysql:disc1pathway;user=home;password=home');
my $dbs  = DBI->connect('DBI:mysql:results;user=home;password=home');
my $main = $dbh->prepare("select chr, position from main_table");        
my $q1   = $dbs->prepare("select position, symbol, chrom from query1");

$main->execute();
$q1->execute();    

while (my $main_ref = $main->fetchrow_hashref()) {
    $ucsc{$main_ref->{chr}}{$main_ref->{position}} = 1;
}

while (my $gene_ref = $q1->fetchrow_hashref()) {
    my $q1position = $gene_ref->{position};
    my $q1symbol   = $gene_ref->{symbol};
    my $q1chr      = $gene_ref->{chr};

    foreach my $ucsc (keys %{$ucsc{$q1chr}}) {
        print "$ucsc $q1symbol\n";
    }
}

$dbh->disconnect();
$dbs->disconnect();   

exit (0);

Below are samples of main_table and query1. The desired output is what I expect to get; I produced it with the VLOOKUP function in Excel.

main_table              
CHR Position        
chr1    229830537       
chr1    229723373           
chr1    229723385           
chr1    229723393           
chr1    229723420           
chr1    229829627       
chr1    229723430           
chr1    229829926       
chr1    229723483           
chr1    229723490           
chr1    229723499           
chr1    229723501           
chr1    229830343       
chr1    229723534           
chr1    229723540           
chr1    230039934       
chr1    229723576           
chr1    229830537       
chr1    229830469           
chr1    229725982           
chr1    229726209       
chr1    229966154       
chr1    229726439           
chr1    229726726           
chr1    229726755           
chr1    229726973       
chr1    229967564       
chr1    229727249           
chr1    229727408           
chr1    229727612           
chr1    229728018           
chr1    229728050           
chr1    229728435                           
chr1    229728513                           
chr1    229966327                           

Query1              
symbol  CHR Position        
C1  chr1    229829230       
C1  chr1    229829278           
C1  chr1    229829442       
C1  chr1    229829627       
C1  chr1    229829653       
C1  chr1    229829683       
C1  chr1    229829810           
C1  chr1    229829926       
C1  chr1    229829961           
C1  chr1    229830085           
C1  chr1    229830086           
C1  chr1    229830087           
C1  chr1    229830088       
C1  chr1    229830141           
C1  chr1    229830343       
C1  chr1    229830469       
C1  chr1    229830534       
C1  chr1    229830537       
C2  chr1    230039932       
C2  chr1    230039934           
C2  chr1    230039939       
C2  chr1    230039944       
457 chr1    229966154           
457 chr1    229966327       
457 chr1    229966500           
457 chr1    229966552           
457 chr1    229966748       
457 chr1    229966998           
457 chr1    229967327           
457 chr1    229967564           
457 chr1    229967594           
457 chr1    229829627       



Desired Output          
Position    symbol      
229830537   C1      
229723373           
229723385           
229723393           
229723420           
229829627   C1, 457     
229723430           
229829926   C1      
229723483           
229723490           
229723499           
229723501           
229830343   C1      
229723534           
229723540           
230039934   C2      
229723576           
229830537   C1      
229830469           
229725982           
229726209           
229966154   457     
229726439           
229726726           
229726755           
229726973           
229967564   457     
229727249           
229727408           
229727612           
229728018           
229728050           
229728435           
229728513           
229966327           

Thanks in advance,

Karen

4 answers:

Answer 0: (score: 1)

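It sounds like you need to perform a join in your SQL query, but for that to work the two tables need some kind of relationship, that is, a column (or columns) they share. You can use the MySQL reference manual's section on JOIN syntax to work out what you need.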

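On the Perl side, you will need to write the logic for the output. I would suggest building a hash with the position as the key and any symbols as the value. Populate the hash first, then print it. That should make it easier to output the query results the way you want.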
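A minimal sketch of that hash idea, assuming the rows have already been fetched. Plain array refs stand in for the DBI results here, and `match_positions` is a hypothetical helper name rather than anything from the question. The hash is keyed by chromosome first and position second, so a position is only matched within its own chromosome:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pair every main_table position with the symbols
# found for it in query1. Rows are array refs standing in for DBI results.
sub match_positions {
    my ($main_rows, $query_rows) = @_;

    # First pass: index query1 by chromosome and position.
    # Each entry is a list, because one position can carry several symbols.
    my %symbols_at;
    for my $row (@$query_rows) {
        my ($symbol, $chr, $position) = @$row;
        push @{ $symbols_at{$chr}{$position} }, $symbol;
    }

    # Second pass: walk main_table in its original order and look up each row.
    my @lines;
    for my $row (@$main_rows) {
        my ($chr, $position) = @$row;
        my $found = $symbols_at{$chr}{$position} || [];   # no symbol: empty list
        push @lines, join("\t", $position, join(', ', @$found));
    }
    return @lines;
}

# A few rows copied from the sample tables in the question.
my @main_rows = (
    [ 'chr1', 229830537 ],
    [ 'chr1', 229723373 ],
    [ 'chr1', 229829627 ],
    [ 'chr1', 230039934 ],
);
my @query_rows = (
    [ 'C1',  'chr1', 229829627 ],
    [ 'C1',  'chr1', 229830537 ],
    [ 'C2',  'chr1', 230039934 ],
    [ '457', 'chr1', 229829627 ],
);

print "$_\n" for match_positions(\@main_rows, \@query_rows);
```

With a live database, the two loops would presumably read from `fetchrow_array` or `fetchrow_hashref` instead of the literal lists. Indexing query1 and then iterating main_table (rather than the other way round) is what keeps the positions that have no symbol in the output.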

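Answer 1: (score: 0)

If you already have all the data and are only wondering how to output it in columns, you should look at sprintf and printf, which let you format the output string.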
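As a rough illustration, assuming the position and its symbols are already in hand: `format_row` below is a hypothetical helper, and the column width of 12 is just a guess that happens to fit the nine-digit positions in the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: format one output row as two columns.
# "%-12s" left-aligns the position and pads it to 12 characters.
sub format_row {
    my ($position, @symbols) = @_;
    return sprintf("%-12s%s", $position, join(', ', @symbols));
}

print format_row('Position', 'symbol'), "\n";
print format_row(229829627, 'C1', '457'), "\n";
print format_row(229723373), "\n";    # no symbol: second column stays empty
```

`printf` takes the same format string and prints directly, so `printf "%-12s%s\n", $position, $symbols;` would do the same job without the helper.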

Answer 2: (score: 0)

use strict;
use DBI;

my %ucsc;

my $dbh  = DBI->connect('DBI:mysql:disc1pathway;user=home;password=home');
my $dbs  = DBI->connect('DBI:mysql:results;user=home;password=home');

my $main = $dbh->prepare("select chr, position from main_table");
$main->execute();

my $q1 = $dbs->prepare("select position, symbol, chrom from query1");
$q1->execute();


while (my $main_ref = $main->fetchrow_hashref()) {
    $ucsc{$main_ref->{chr}}{$main_ref->{position}} = 1;
}

while (my $gene_ref = $q1->fetchrow_hashref()) {
    my $q1position = $gene_ref->{position};
    my $q1symbol   = $gene_ref->{symbol};
    my $q1chr      = $gene_ref->{chr};

    foreach my $ucsc (keys %{$ucsc{$q1chr}}) {
        print "$ucsc $q1symbol\n";
    }
}

$dbh->disconnect();
$dbs->disconnect();   

exit (0);

=============================================== ======================================

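The code above only lists the positions and symbols, but it does not match them. I can't seem to work out how to match them. Any suggestions?

Thanks. Karen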

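Answer 3: (score: 0)

Weegee has the right answer. You can say where a table lives by qualifying its name as database.table, as long as both databases are on the same MySQL server and the connecting user can read both. If the table is in the connection's default database, you can drop the database part. So your code should look something like: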

#!/usr/bin/perl

use strict;
use warnings;

use DBI;

my $dbh = DBI->connect(
    'DBI:mysql:disc1pathway',
    "home",
    "home",
    {
        ChopBlanks       => 1,
        AutoCommit       => 1,
        PrintError       => 0,
        RaiseError       => 1,
        FetchHashKeyName => 'NAME_lc',
    }
) or die "could not connect to database: ", $DBI::errstr;

# Each table is named once; columns are qualified as database.table.column.
my $sth = $dbh->prepare("
    SELECT
        disc1pathway.main_table.chr,
        disc1pathway.main_table.position,
        results.query1.symbol,
        results.query1.chrom
    FROM disc1pathway.main_table
    JOIN results.query1 ON (
        disc1pathway.main_table.position = results.query1.position
    )
");

$sth->execute;

while (my $col = $sth->fetchrow_hashref) {
    print join(" ", @{$col}{qw/chr position symbol chrom/}), "\n";        
}

$sth->finish;

$dbh->disconnect;
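
A plain JOIN only returns positions that have at least one symbol, while the desired output in the question also lists positions with none. One possible variation, sketched here under assumptions, is a LEFT JOIN combined with MySQL's GROUP_CONCAT so that each position comes back once with its symbols already joined. `matching_sql` is a hypothetical helper that only builds the statement; it assumes both databases are on the same server and that query1 calls its chromosome column chrom, as in the question's SELECT.

```perl
use strict;
use warnings;

# Hypothetical helper: build one SQL statement that keeps every main_table
# position and gathers the matching symbols into a comma-separated list.
# Assumption: query1 names its chromosome column "chrom".
sub matching_sql {
    return <<'SQL';
SELECT m.position,
       GROUP_CONCAT(q.symbol SEPARATOR ', ') AS symbols
FROM disc1pathway.main_table AS m
LEFT JOIN results.query1 AS q
       ON q.position = m.position
      AND q.chrom    = m.chr
GROUP BY m.chr, m.position
SQL
}

print matching_sql();

# With a live connection this would be used roughly as:
#   my $sth = $dbh->prepare(matching_sql());
#   $sth->execute;
#   while (my ($position, $symbols) = $sth->fetchrow_array) {
#       # symbols is NULL (undef) when a position has no match
#       printf "%-12s%s\n", $position, defined $symbols ? $symbols : '';
#   }
```

Note that GROUP BY collapses repeated positions into a single row and does not preserve the original row order, so this differs from the sample output, where 229830537 appears twice.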