Error in Simple HTML DOM

Time: 2015-09-01 13:04:15

Tags: php simple-html-dom

When I run my code in WampServer, I get this error:


Warning: file_get_contents(http://www.fragrantica.com/designers/A-Perfume-Organic.html):
failed to open stream: A connection attempt failed because the connected party did not
properly respond after a period of time, or established connection failed because
connected host has failed to respond. in
F:\wamp\www\atr\fragantica\simple_html_dom.php on line 76

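I tested this earlier and it worked for 3 of my URLs, but after that it started giving me this error. I am using my local WAMP server and the latest version of simple_html_dom. My code is a bit complicated, but it is readable: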

function connect($furl,$fsname){
$fup=fopen("$furl","r"); // open the file of URLs for reading
$fname=fopen("$fsname","r"); // open the file of output names for reading
$i=0;
while(!feof($fup)){
    $url=trim(fgets($fup)); // read the next URL from the URL file
    $name=trim(fgets($fname)); // read the matching output name
    $fdoc=fopen("$name.txt","w"); // create a new file to hold the scraped contents
    $html=file_get_html("$url"); // fetch and parse the HTML page at $url
    foreach($html->find("div.perfumeslist p") as $tag){
        foreach($tag->find("a") as $alink){
            $perlink="http://www.fragrantica.com".$alink->href;
            fwrite($fdoc,"##PERFUME_LINK:##".$perlink."\n"."\n");
        }
        foreach($tag->find("img") as $im){
            fwrite($fdoc,"##THUMB_SRC:##".$im->src."\n");
        }
        foreach($tag->find("span.mtext") as $sp){
            fwrite($fdoc,"##SEX:##".$sp->innertext."\n");
        }
        $perfume=file_get_html("$perlink");
        foreach($perfume->find("div") as $disc){
            if(strcmp($disc->itemprop,"description")===0){
                fwrite($fdoc,"##DESCRIPTION:##".$disc->innertext."\n");
            }
        }
        foreach($perfume->find("div#mainpicbox img") as $per){
            $pic=$per->src;
            fwrite($fdoc,"##MAINPICURL:##".$pic."\n");
        }
        foreach($html->find("div") as $tag){
            if(strcmp($tag->style,"width: 230px; float: left; text-align: center; clear: left;")===0){
                foreach($tag->find("p") as $notes){
                    fwrite($fdoc,"##NOTES:##".$notes->innertext."\n"."\n");
                }
            }

        }


    fwrite($fdoc,"___________________________________________________________________"."\n");
}
        fclose($fdoc);
    }
    fclose($fup);
    fclose($fname);
}

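About my code:
In this function I read two files: one holds the names for my output text files and the other holds the URLs. I read both files line by line until the end of the URL file. For each URL I call file_get_html and then pull out tags and attributes to get the link, the image source, and the inner text...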
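
The two-file reading described above could be sketched roughly as follows. This is only an illustration under assumptions: `read_url_name_pairs` is a hypothetical helper that is not in the original code, and it assumes that line N of the URL file belongs with line N of the names file. It builds the list of pairs up front and skips blank lines, so the scraping loop never receives an empty URL.

```php
<?php
// Hypothetical helper (not part of the original code): pair line N of the
// URL file with line N of the names file, skipping incomplete pairs.
function read_url_name_pairs(string $urlFile, string $nameFile): array
{
    $urls  = file($urlFile, FILE_IGNORE_NEW_LINES);
    $names = file($nameFile, FILE_IGNORE_NEW_LINES);
    if ($urls === false || $names === false) {
        throw new RuntimeException('Could not read the URL list or the name list.');
    }

    $pairs = [];
    $count = min(count($urls), count($names)); // stop at the shorter file
    for ($i = 0; $i < $count; $i++) {
        $url  = trim($urls[$i]);
        $name = trim($names[$i]);
        if ($url === '' || $name === '') {
            continue; // blank line in either file: nothing to fetch or to name
        }
        $pairs[] = ['url' => $url, 'name' => $name];
    }
    return $pairs;
}
```

Each returned pair can then be handled with `foreach (read_url_name_pairs($furl, $fsname) as $pair) { ... }` in place of the `while(!feof($fup))` loop.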

1 answer:

Answer 0: (score: 1)

$ch = curl_init();
$timeout = 20;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

$fileContents = curl_exec($ch);
curl_close($ch);

// Create a DOM object
$html = new simple_html_dom();
// Load HTML from a string
$html->load($fileContents);

Try this. I have not run it myself, so I am not sure it works.
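
Since the error in the question is a connection timeout, it may also help to check whether the download succeeded before parsing the result. Below is a minimal sketch of that idea, not a tested fix: `fetch_html` is a hypothetical helper name, and the retry count, the delay, and the 60-second overall limit are assumed values. It uses only standard cURL calls and returns `null` when every attempt fails.

```php
<?php
// Hypothetical helper (not from the answer above): download a page with cURL,
// retry a few times, and return null if no attempt gives a 2xx response.
function fetch_html(string $url, int $retries = 3, int $delaySeconds = 2): ?string
{
    for ($attempt = 1; $attempt <= $retries; $attempt++) {
        $ch = curl_init();
        curl_setopt_array($ch, [
            CURLOPT_URL            => $url,
            CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_CONNECTTIMEOUT => 20,   // seconds to wait for the connection
            CURLOPT_TIMEOUT        => 60,   // assumed limit for the whole transfer
        ]);
        $body   = curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        unset($ch); // drop our reference so the handle can be released

        if ($body !== false && $status >= 200 && $status < 300) {
            return $body;
        }
        if ($attempt < $retries) {
            sleep($delaySeconds); // short pause before the next attempt
        }
    }
    return null;
}
```

In the scraping loop, the result would be checked before use: if `fetch_html($url)` returns `null`, skip that URL, otherwise pass the string to `str_get_html()` from simple_html_dom and call `find()` on the returned object.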