PHP: scraping and storing output locally

Time: 2014-02-28 15:56:05

Tags: php html storage scrape

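I have a small PHP script that pulls information from an Atom feed. What is the best way to push the echoed output to a file or a database? I would like to be able to reference this information every hour.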

For example:

PHP grabs HTML page.
Webserver stores the resulting HTML locally.
Webserver references that locally stored HTML.
Webserver displays a template version of the local information.

Here is the PHP that pulls the HTML:

<?php include('simple_html_dom.php'); echo file_get_html('http://www.website.com')->plaintext[1] ; ?>

2 answers:

Answer 0 (score: 0)

You can use file_put_contents(), which is very simple: http://php.net/file_put_contents

file_put_contents("myFile", file_get_html('http://www.website.com')->plaintext[1]);
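
To cover the hourly part of the question, one option is to check the age of the stored file before fetching again. The following is only a minimal sketch: cached_fetch() is a hypothetical helper name, and the file name and the 3600-second limit are assumptions, not something from the question.

```php
<?php
// Hypothetical helper: return the cached text, refreshing it when the
// cache file is missing or older than $maxAge seconds.
// $fetch is any callable that returns the fresh string.
function cached_fetch(string $cacheFile, int $maxAge, callable $fetch): string
{
    clearstatcache(true, $cacheFile);   // do not trust a stale mtime
    $fresh = is_file($cacheFile)
        && (time() - filemtime($cacheFile)) < $maxAge;

    if (!$fresh) {
        $data = (string) $fetch();
        // LOCK_EX keeps two simultaneous requests from interleaving writes
        file_put_contents($cacheFile, $data, LOCK_EX);
        return $data;
    }

    $data = file_get_contents($cacheFile);
    return $data === false ? '' : $data;
}
```

With simple_html_dom.php included as in the question, a call might look like cached_fetch('myFile', 3600, function () { return file_get_html('http://www.website.com')->plaintext; }); and the template page would then echo the returned string. A cron job that rewrites the file every hour is another way to get the same result.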

Answer 1 (score: 0)

Hope this helps...

<html>
<head>
<title>Online PHP Script Execution</title>
</head>
<body>
<?php require'simple_html_dom.php'; ?>
<?php
    $url_path='http://www.google.com/';
    $html=file_get_html($url_path) or die("can't load page");   //parse once; the loops below reuse $html
    $create=fopen("folder/"."index.html",'w+') or die("can't open file");
    $src=$html->plaintext ;
    $write=fwrite($create,$src);
        fclose($create);

    foreach($html->find('script') as $script)   //get script
            {
                    $scriptPath=$script->src;
                    if(!$scriptPath) continue;      //inline script, nothing to download
                    $js = explode("/", $scriptPath);
                    $jsName = end($js);
                    $path="";
                    for ($i=0;$i<(count($js)-1);$i++)
                      {
                        $path .= $js[$i] . "/";
                        if(!file_exists('folder/'.$path))
                            {
                                mkdir('folder/'.$path, 0777, true);     //script folder created
                            }
                      }
                    file_put_contents('folder/'.$path.$jsName,file_get_contents($url_path.$scriptPath)); //script downloaded (assumes a relative src)
            }

        foreach($html->find('img') as $img)     //image
                {

                    $imgpath=$img->src;
                    $image = explode("/", $imgpath);
          $path="";
                      for ($i=0;$i<(count($image)-1);$i++) 
                      {
                        $path .= $image[$i] . "/";
                        if(!file_exists('folder/'.$path))
                            {
                                mkdir('folder/'.$path, 0777, true);     //img folder created
                            }       
                      }

                    $imgName = end($image);
                        file_put_contents('folder/'.$path.$imgName,file_get_contents($url_path.$imgpath));  //img downloaded

                } 

        foreach($html->find('link') as $link)  //get link
                {
                        if(strtolower($link->getAttribute('rel')) != "stylesheet" )
                        {
                            continue;       //only stylesheets are downloaded
                        }

                        $linkpath=$link->getAttribute('href');
                        $links = explode("/", $linkpath);
                        $linkName = end($links);
                        $path="";
                        for ($i=0;$i<(count($links)-1);$i++)
                          {
                            $path .= $links[$i] . "/";
                            if(!file_exists('folder/'.$path))
                                {
                                    mkdir('folder/'.$path, 0777, true);     //css folder created
                                }
                          }

                        file_put_contents('folder/'.$path.$linkName,file_get_contents($url_path.$linkpath));    //download css (assumes a relative href)
                }
?>
</body>
</html>
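
The three loops above each rebuild the same directory path by hand. As a sketch only, that step could be moved into one function. local_asset_path() is a hypothetical name, not part of simple_html_dom, and it assumes the src/href is either site-relative or an absolute URL whose path part is what should be mirrored.

```php
<?php
// Hypothetical helper: map an asset's src/href to a file path under
// $baseDir and create any missing sub-directories.
function local_asset_path(string $baseDir, string $src): string
{
    // Keep only the path part, so "http://host/css/a.css?v=2" -> "/css/a.css"
    $path = parse_url($src, PHP_URL_PATH);
    if (!is_string($path) || $path === '' || substr($path, -1) === '/') {
        throw new InvalidArgumentException("no file name in: $src");
    }

    // Drop empty, "." and ".." segments so the result stays inside $baseDir
    $segments = array_filter(explode('/', $path), function ($s) {
        return $s !== '' && $s !== '.' && $s !== '..';
    });
    if (!$segments) {
        throw new InvalidArgumentException("no file name in: $src");
    }

    $name = array_pop($segments);
    $dir  = rtrim($baseDir, '/');
    if ($segments) {
        $dir .= '/' . implode('/', $segments);
    }

    // The third is_dir() check covers a directory created concurrently
    if (!is_dir($dir) && !mkdir($dir, 0777, true) && !is_dir($dir)) {
        throw new RuntimeException("cannot create directory: $dir");
    }
    return $dir . '/' . $name;
}
```

Each loop body would then reduce to something like file_put_contents(local_asset_path('folder', $imgpath), file_get_contents($url_path.$imgpath)); where the download URL is still built the same way as in the answer.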