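I'm getting very strange results, and I know it must be something small that I'm doing wrong. I'm trying to check whether a row exists in a PostgreSQL database table. On the first iteration of the loop I get the actual value. On the second iteration and every iteration after it, I get undef. Why? Is there something I have to do that I'm not doing? I'm not using prepare for this query, so I shouldn't need to call finish, etc.
Any insight would greatly help me debug this problem.
Sorry, the code is pretty nasty right now. I've been debugging, and that has made things very ugly.
Sorry also for the ugly sample output. I don't know how to format it nicely on Stack Overflow.
Please note the "Select Name" lines in the sample output: every iteration after the first returns undef. The SQL call in question is near the end of the file. It is the line
my $selectSQL = "select name from crawler_url where url='http://www.maccosmetics.com$item->{'uri'}' ";
Perl code: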
#!/usr/bin/perl
use LWP::Simple; # From CPAN
use JSON qw( decode_json ); # From CPAN
use JSON::Parse 'parse_json';
use Data::Dumper; # Perl core module
use HTML::TreeBuilder 5 -weak;
use Mojo::DOM;
use DBI;
use String::Util qw(trim);
use strict; # Good practice
use warnings; # Good practice
my $initialize = 0;
my $debug = 1;
&main;
sub main {
    my $dbh = connect2db();
    unless(defined($dbh)) {
        exit 1;
    }
    my $trendsurl;
    my $sth = $dbh->prepare("SELECT company_name from companies where active=1");
    $sth->execute;
    while( my $company = $sth->fetchrow_hashref() ) {
        #print Dumper($company)."\n";
        my $sth2 = $dbh->prepare("SELECT url from crawlers where company_name='$$company{'company_name'}' ");
        $sth2->execute;
        while( my $url = $sth2->fetchrow_hashref() ) {
            #print " NOW ON URL $$url{'url'} ##########\n";
            $trendsurl = $$url{'url'};
            chomp($trendsurl);
            $trendsurl = trim($trendsurl);
            print "URL: ".$trendsurl."\n";
            my $json = get( $trendsurl );
            die "Could not get $trendsurl!" unless defined $json;
            my $parsed_json = parse_json($json);
            my $items = $parsed_json->{'sections'}[0]->{'items'};
            foreach my $item_hash (@$items) {
                #print Dumper($item_hash)."\n";
                my $category = $item_hash->{'name'};
                print "Lip Product Category: $category\n";
                foreach my $item ( @{ $item_hash->{'items'} } ) {
                    print Dumper($item)."\n";
                    my $selectSQL = "select name from crawler_url where url='http://www.maccosmetics.com$item->{'uri'}' ";
                    print $selectSQL."\n" if($debug);
                    my ($productCount) = $dbh->selectrow_array($selectSQL);
                    my $date = localtime;
                    chomp($productCount);
                    trim($productCount);
                    chomp($item->{'name'});
                    trim($item->{'name'});
                    print "Select Name: '$productCount'\n";
                    print "Item Name: '$item->{'name'}'\n";
                    print "Do they equal: ", index($productCount, $item->{'name'}), " \n";
                    print Dumper($productCount);
                    if( index($productCount, $item->{'name'}) == -1 ) {
                        my $insertSQL = "insert into crawler_url (first_seen,url,name,category,last_checked) values ('$date','http://www.maccosmetics.com$item->{'uri'}','$item->{'name'}','$category','$date') ";
                        print $insertSQL."\n" if($debug);
                        my $retVal = $dbh->do($insertSQL);
                        $insertSQL = "insert into urls (company_name,url) values ('$$company{'company_name'}','http://www.maccosmetics.com$item->{'uri'}') ";
                        print $insertSQL."\n" if($debug);
                        $retVal = $dbh->do($insertSQL);
                    }
                    else {
                        #We have seen this before
                        my $updateSQL = "update crawler_url SET (url,name,category,last_checked) = ('http://www.maccosmetics.com$item->{'uri'}','$item->{'name'}','$category','$date' )";
                        print $updateSQL."\n" if($debug);
                        my $retVal = $dbh->do($updateSQL);
                    }
                }
            }
        }
    }
}
sub connect2db {
    return DBI->connect("dbi:Pg:dbname=xxxxxx", "xxxxx", "XXXXXX");
}
Sample output:
URL: http://www.maccosmetics.com/includes/panel_nav/catalog.js?CATEGORY_ID=CAT163&LOCALE=en_US
Lip Product Category: Lipstick
$VAR1 = {
'uri' => '/product/shaded/168/310/Products/Lips/Lipstick/Lipstick/index.tmpl',
'description' => 'Colour plus texture for the lips. Stands out on the runway...',
'name' => 'Lipstick',
'thumbnail' => '/images/products/56x56/M300.jpg',
'header' => '/images/pnav/product/headers/pnav_M300_200x12_off.gif',
'id' => 'CAT168PROD310'
};
select name from crawler_url where url='http://www.maccosmetics.com/product/shaded/168/310/Products/Lips/Lipstick/Lipstick/index.tmpl'
Select Name: 'Lipstick '
Item Name: 'Lipstick'
Do they equal: -1
$VAR1 = 'Lipstick ';
update crawler_url SET (url,name,category,last_checked) = ('http://www.maccosmetics.com/product/shaded/168/310/Products/Lips/Lipstick/Lipstick/index.tmpl','Lipstick','Lipstick','Wed Jan 28 21:15:40 2015' )
$VAR1 = {
'id' => 'CAT168PROD34492',
'thumbnail' => '/images/products/56x56/MX5G8N.jpg',
'header' => '/images/pnav/product/headers/pnav_MX5G8N_200x12_off.gif',
'description' => "Miley Cyrus\x{2019}s shade of VIVA GLAM Lipstick. Her super-sexy hot...",
'name' => 'VIVA GLAM Miley Cyrus Lipstick',
'uri' => '/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl'
};
select name from crawler_url where url='http://www.maccosmetics.com/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl'
Select Name: ''
Item Name: 'VIVA GLAM Miley Cyrus Lipstick'
Do they equal: 0
$VAR1 = undef;
insert into crawler_url (first_seen,url,name,category,last_checked) values ('Wed Jan 28 21:15:40 2015','http://www.maccosmetics.com/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl','VIVA GLAM Miley Cyrus Lipstick','Lipstick','Wed Jan 28 21:15:40 2015')
insert into urls (company_name,url) values ('MAC ','http://www.maccosmetics.com/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl')
$VAR1 = {
'uri' => '/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl',
'description' => 'Formulated to shade, define and showcase the lips in a rouge-y...',
'name' => 'Isabel and Ruben Toledo Lipstick ',
'header' => '/images/pnav/product/headers/pnav_MWWE1T_200x12_off.gif',
'thumbnail' => '/images/products/56x56/MWWE1T.jpg',
'id' => 'CAT168PROD34798'
};
select name from crawler_url where url='http://www.maccosmetics.com/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl'
Select Name: ''
Item Name: 'Isabel and Ruben Toledo Lipstick '
Do they equal: 0
$VAR1 = undef;
insert into crawler_url (first_seen,url,name,category,last_checked) values ('Wed Jan 28 21:15:40 2015','http://www.maccosmetics.com/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl','Isabel and Ruben Toledo Lipstick ','Lipstick','Wed Jan 28 21:15:40 2015')
insert into urls (company_name,url) values ('MAC ','http://www.maccosmetics.com/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl')
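A side observation on the sample output, offered as a sketch rather than a diagnosis: the selected name still prints with trailing whitespace ('Lipstick ') after the call trim($productCount). That is consistent with String::Util's trim returning the trimmed string instead of modifying its argument, so a call in void context has no visible effect. The values below are made up to mirror the output; only trim itself comes from the listing.

```perl
use strict;
use warnings;
use String::Util qw(trim);

# Hypothetical values shaped like the sample output above.
my $from_db   = "Lipstick     ";   # padded, as a database column might return it
my $from_json = "Lipstick";

# Void-context call: the trimmed copy is thrown away.
trim($from_db);
my $after_void_call = $from_db;    # still padded

# Assigning the return value keeps the trimmed copy.
my $clean = trim($from_db);

# Guard against undef (no row found) before comparing.
my $matches = ( defined $clean && $clean eq $from_json ) ? 1 : 0;

print "after void call: [$after_void_call]\n";
print "after assignment: [$clean]\n";
print "matches: $matches\n";
```

Checking defined before the comparison also avoids the "uninitialized value" warnings that chomp and index would raise when selectrow_array finds no row.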
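Update:
When I add a next before the $dbh->do calls, I get the results I expect. So it has something to do with executing $dbh->do($insertSQL) or $dbh->do($updateSQL). Do I need to make some other call after them before I use $dbh->selectrow_array($selectSQL) again on the second iteration? If so, which one?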
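One thing worth checking, given that skipping the $dbh->do calls changes the result: the UPDATE in the listing has no WHERE clause, and in SQL an UPDATE without WHERE modifies every row of the table. If that statement runs on the first iteration, every row ends up with the first item's URL, so later lookups by a different URL would find nothing. The sketch below reproduces that effect under loud assumptions: an in-memory SQLite database (DBD::SQLite) stands in for PostgreSQL, the table is cut down to two columns, and the example.com URLs are invented. It also uses ? placeholders instead of interpolating values into the SQL string.

```perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory SQLite as a stand-in for the PostgreSQL database.
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 } );
$dbh->do("CREATE TABLE crawler_url (url TEXT, name TEXT)");

# Two hypothetical rows.
for my $row ( [ "http://example.com/a", "A" ], [ "http://example.com/b", "B" ] ) {
    $dbh->do( "INSERT INTO crawler_url (url, name) VALUES (?, ?)", undef, @$row );
}

my $lookup = "SELECT name FROM crawler_url WHERE url = ?";

# Scoped update: only the row for /a changes, /b is still found afterwards.
$dbh->do( "UPDATE crawler_url SET name = ? WHERE url = ?",
    undef, "A2", "http://example.com/a" );
my ($b_after_scoped) = $dbh->selectrow_array( $lookup, undef, "http://example.com/b" );

# Unscoped update (no WHERE): every row now carries the /a URL.
$dbh->do( "UPDATE crawler_url SET url = ?, name = ?",
    undef, "http://example.com/a", "A2" );
my ($b_after_unscoped) = $dbh->selectrow_array( $lookup, undef, "http://example.com/b" );

print "after scoped update:   ", ( defined $b_after_scoped   ? $b_after_scoped   : "undef" ), "\n";
print "after unscoped update: ", ( defined $b_after_unscoped ? $b_after_unscoped : "undef" ), "\n";

$dbh->disconnect;
```

If this is what is happening in the original script, adding a WHERE url = ... condition to the UPDATE would be the change to try first; no extra DBI call between do and selectrow_array is required.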
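Answer 0 (score: -1)
You really should add $sth2->finish(); at the end of your inner while loop, and $sth->finish(); after your outer while loop. Not calling finish on the inner loop could cause the first iteration to work but none of the subsequent ones, just as you describe in your question.
To say the least, you can usually get away with it if you have no nested fetches. Once you have nested fetches without a corresponding finish, you run into exactly the problem you describe.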
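For reference, here is a minimal sketch of where the finish calls suggested in this answer would go in a nested fetch. The assumptions are the same kind as elsewhere: an in-memory SQLite database (DBD::SQLite) instead of PostgreSQL, and invented rows and example.com URLs. Note that the DBI documentation describes finish as rarely needed when a loop fetches every row, since the handle is finished automatically once the last row has been read; it mainly matters when fetching stops early.

```perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory SQLite as a stand-in for the PostgreSQL database.
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 } );
$dbh->do("CREATE TABLE companies (company_name TEXT, active INTEGER)");
$dbh->do("CREATE TABLE crawlers (company_name TEXT, url TEXT)");

# Hypothetical data: two active companies and one inactive one.
$dbh->do( "INSERT INTO companies VALUES (?, ?)", undef, @$_ )
    for ( [ "MAC", 1 ], [ "Acme", 1 ], [ "Old", 0 ] );
$dbh->do( "INSERT INTO crawlers VALUES (?, ?)", undef, @$_ )
    for (
        [ "MAC",  "http://example.com/one" ],
        [ "MAC",  "http://example.com/two" ],
        [ "Acme", "http://example.com/acme" ],
    );

my @pairs;

my $sth = $dbh->prepare(
    "SELECT company_name FROM companies WHERE active = 1 ORDER BY company_name");
# Prepared once and re-executed with a bound parameter on each outer row.
my $sth2 = $dbh->prepare(
    "SELECT url FROM crawlers WHERE company_name = ? ORDER BY url");

$sth->execute;
while ( my $company = $sth->fetchrow_hashref ) {
    $sth2->execute( $company->{company_name} );
    while ( my $row = $sth2->fetchrow_hashref ) {
        push @pairs, "$company->{company_name} $row->{url}";
    }
    $sth2->finish;    # end of the inner loop, as the answer suggests
}
$sth->finish;         # after the outer loop, as the answer suggests

print "$_\n" for @pairs;
$dbh->disconnect;
```

Preparing the inner statement once with a ? placeholder, rather than building a new SQL string for each company, also avoids quoting problems with values such as the padded 'MAC ' seen in the sample output.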