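I am writing a script that inserts values scraped from a site into a database, but each run inserts the row twice. After a great deal of analysis, I still cannot work out why the data ends up in the database twice.
My code is as follows: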
<?php
include('simple_html_dom.php');
$aflink3 = "http://aliveforfootball.com/blog/david-moyes-confident-manchester-united-future/";
$linkurl = $aflink3;
// Loading the url
$html = file_get_html($linkurl);
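// Lookup table of selectors used to locate the article body:
// row 0 holds the column selectors; each later row holds a parent selector
// followed by a 1 for every column selector that should be combined with it
$States = array(
    array("state", "div.entry-content", ""),
    array("article.post", 1, 1)
);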
// Finding the title of the article (stays null when there is no og:title meta tag)
$metatitle = null;
$titletag = $html->find("meta[property='og:title']", 0);
if ($titletag != null) {
    $metatitle = $titletag->content;
}
$title = $metatitle;
// Collect the og:image meta tag; only the last match is kept,
// stored as array('image' => url)
$metaimages = array();
foreach ($html->find("meta[property='og:image']") as $metaimage) {
    $metaimages = array('image' => $metaimage->content);
}
// Function to collect the paragraphs of an article into the global $content,
// blanking any paragraph that contains the first 5 characters of a <span> text
function findParagraphs($article){
    global $subtitle1;
    global $articlecontent;
    global $content;
    global $spancontent;
    $spancontent = array();
    $content = array();
    $articlecontent = array();
    foreach($article->find('p') as $p){
        $articlecontent[] = $p->plaintext;
    }
    foreach($article->find('p span') as $spandiv){
        $spancontent[] = $spandiv->plaintext;
    }
    $articlelength = count($articlecontent);
    $spanlength = count($spancontent);
    for($i=0;$i<$articlelength;$i++){
        for($j=0;$j<$spanlength;$j++){
            if(strpos($articlecontent[$i],substr($spancontent[$j],0,5)) !== false){
                $articlecontent[$i] = "";
            }
        }
    }
    $content = $articlecontent;
}
$flag = 0;
$article = null;
$state = 0;
// Loop over the selector table and use the first combination that matches
// an element on the page as the article section
$rows = count($States);
for($row = 0; $row < $rows; $row++) {
    for($col = 0; $col < 3; $col++ ) {
        echo "[".$row."][".$col."]<BR>";
        if($States[$row][$col] === 1){
            $statefound = $States[$row][0]." ".$States[0][$col];
            $article = $html->find($statefound,0);
            if(isset($article) && ($state == 0)){
                $state = 1;
                findParagraphs($article);
                break 2;
            }
        }
    }
}
// Building the array of scraped data (it is JSON-encoded further down)
$stuff = array(
    'title' => $title,
    'image' => $metaimages,
    'content' => $content
);
// Insert the scraped data into the database
if($stuff != null){
    $jsencode = json_encode($stuff);
    $obj = json_decode($jsencode, TRUE);
    $dbcontent = "";
    for($i=0; $i<count($obj['content']); $i++) {
        $dbcontent .= "<p>".$obj['content'][$i]."</p>";
    }
    // The title is a single string, so wrap it once instead of looping over it
    $dbtitle = "";
    if($obj['title'] !== null) {
        $dbtitle = "<p>".$obj['title']."</p>";
    }
    $dbimage = "";
    if(isset($obj['image']['image'])) {
        $dbimage = "<p>".$obj['image']['image']."</p>";
    }
    // Initializing the MySQL connection (legacy mysql_* API, removed in PHP 7)
    mysql_connect("localhost", "root", "password") or die(mysql_error());
    mysql_select_db("Parsing") or die(mysql_error());
    // Escape the values so quotes in the scraped text cannot break the query
    $sqlurl = mysql_real_escape_string($linkurl);
    $sqltitle = mysql_real_escape_string($dbtitle);
    $sqlimage = mysql_real_escape_string($dbimage);
    $sqlcontent = mysql_real_escape_string($dbcontent);
    mysql_query("INSERT INTO Sportparse
        (linkurl,linktitle,linkimage,linkcontent)
        VALUES('$sqlurl','$sqltitle','$sqlimage','$sqlcontent')")
        or die(mysql_error());
    echo "Data Inserted Successfully";
    // Cleaning up the memory to prevent a memory leak
    $html->clear();
    unset($html);
}
?>
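In the final block of code, the one that writes to the database, the data is inserted twice instead of only once. I have tried everything I can think of and cannot fix it. I think the problem is worth looking into, because several of my colleagues could not figure it out either.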
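
One way to narrow this down is to check whether the script itself runs twice per page load (for example because a second request reaches it), rather than the INSERT running twice within one run. That cause is only an assumption; nothing above confirms it. The sketch below uses a hypothetical helper, `logInvocation`, and an arbitrary log path, neither of which is part of the original script:

```php
<?php
// Hypothetical diagnostic helper (not part of the original script):
// appends one line per script run to a log file so that duplicate
// requests become visible, and returns the number of runs logged so far.
function logInvocation($logFile, array $server)
{
    $line = sprintf(
        "%s %s %s\n",
        date('c'),
        isset($server['REQUEST_METHOD']) ? $server['REQUEST_METHOD'] : 'CLI',
        isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '-'
    );
    // FILE_APPEND keeps earlier entries; LOCK_EX avoids interleaved writes
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);

    // Count the non-empty lines written so far
    return count(file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
}
```

Calling it once at the top of the scraper, e.g. `logInvocation(sys_get_temp_dir() . '/scraper_runs.log', $_SERVER);`, and then loading the page a single time should add exactly one line. Two new lines would suggest the duplicate row comes from a second request to the script rather than from the query code.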