Question

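I am writing a PHP script that must copy the list of publications from the home page and then copy the information contained in each of those publications.

I need to copy the content of the old site and add it to a new site!

I have had some success: my PHP script already copies the list of publications on the home page. Now I need a script that extracts the information from each publication (title, photo, full text)!

For this, I wrote code that extracts the link to each post. Help me write a function that copies the information from a given link!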

    <?php
    header('Content-type: text/html; charset=utf-8');
    require 'phpQuery.php';

    // Pretty-print a value for debugging.
    function print_arr($arr){
        echo '<pre>' . print_r($arr, true) . '</pre>';
    }

    $url = 'http://goruzont.blogspot.com/';
    $file = file_get_contents($url);

    $doc = phpQuery::newDocument($file);

    foreach($doc->find('.blog-posts .post-outer .post') as $article){
        $article = pq($article);

        // Post title and the link to the full post.
        $title = $article->find('.entry-title a')->html();
        print_arr($title);
        $texturl = $article->find('.entry-title a')->attr('href');
        echo $texturl;

        // Post date.
        $date = $article->find('.date-header')->html();
        print_arr($date);

        // The thumbnail URL is stored in the inline style as background:url(...).
        $img = (string) $article->find('.thumb a')->attr('style');
        if (preg_match('!background:url.(.+). no!', $img, $match)) {
            $imgurl = $match[1];
            // Print the image only when a URL was actually found.
            echo "<img src='$imgurl'>";
        }
    }
    ?>
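A minimal sketch of the kind of function being asked for, under some loud assumptions. It uses PHP's built-in `DOMDocument` and `DOMXPath` rather than phpQuery so that it runs without extra files. The function names `parse_post` and `fetch_post` are hypothetical, and the class names `entry-title` and `post-body` are guesses about the Blogger template that should be checked against the real page markup.

```php
<?php
// Hypothetical helper: extracts title, image URLs and body HTML from one post page.
// ASSUMPTION: the title has class "entry-title" and the body has class "post-body".
function parse_post(string $html): array
{
    $dom = new DOMDocument();
    // Suppress warnings caused by imperfect real-world HTML.
    // The XML declaration prefix hints that the input is UTF-8.
    @$dom->loadHTML('<?xml encoding="utf-8"?>' . $html);
    $xpath = new DOMXPath($dom);

    // Builds an XPath that matches elements carrying $class as a whole class name.
    $byClass = function (string $class): string {
        return "//*[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
    };

    $titleNode = $xpath->query($byClass('entry-title'))->item(0);
    $bodyNode  = $xpath->query($byClass('post-body'))->item(0);

    $images = [];
    $text   = '';
    if ($bodyNode !== null) {
        // Collect the src attribute of every image inside the post body.
        foreach ($xpath->query('.//img/@src', $bodyNode) as $src) {
            $images[] = $src->nodeValue;
        }
        // Outer HTML of the body element, including its wrapper tag.
        $text = trim($dom->saveHTML($bodyNode));
    }

    return [
        'title'  => $titleNode !== null ? trim($titleNode->textContent) : '',
        'images' => $images,
        'text'   => $text,
    ];
}

// Hypothetical wrapper: downloads the page at $url and parses it.
// Returns null when the download fails or the page is empty.
function fetch_post(string $url): ?array
{
    $html = @file_get_contents($url);
    if ($html === false || $html === '') {
        return null;
    }
    return parse_post($html);
}
```

Inside the existing loop, this could be called as `print_arr(fetch_post($texturl));` for each extracted link. Parsing is kept separate from downloading so the extraction logic can be tried on a saved HTML string first.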

Web scraping information from a site with PHP

0 answers: