How do I parse meta tags instead of the selectors in my array in PHP?

Asked: 2013-11-30 08:23:35

Tags: php

I am new to PHP development. I want to parse the content of meta tags, but I don't know how to proceed.

I have this code, which parses the content of the elements matched by the selectors #squad and ul[style='height:400px;'].

// Pull in PHP Simple HTML DOM Parser
include("simplehtmldom/simple_html_dom.php");

// Settings on top
$sitesToCheck = array(
                    // "selector" is the CSS selector whose content is watched on that page
                    array("url" => "http://www.arsenal.com/first-team/players", "selector" => "#squad"),
                    array("url" => "http://www.liverpoolfc.tv/news", "selector" => "ul[style='height:400px;']")
                );
$savePath = "cachedPages/";
$emailContent = "";

// For every page to check...
foreach($sitesToCheck as $site) {
    $url = $site["url"];

    // Calculate the cached page name and reset the per-site content
    $fileName = md5($url);
    $oldContent = "";
    $currentContent = "";

    // Get the URL's current page content; skip this site if it cannot be loaded
    $html = file_get_html($url);
    if (!$html) {
        continue;
    }

    // Find content by querying with a selector, just like a selector engine!
    // (if several elements match, the last one wins)
    foreach($html->find($site["selector"]) as $element) {
        $currentContent = $element->plaintext;
    }

    // If a cached file exists
    if(file_exists($savePath.$fileName)) {
        // Retrieve the old content
        $oldContent = file_get_contents($savePath.$fileName);
    }

    // If different, notify!
    if($oldContent && $currentContent != $oldContent) {


        // Append to the email content so every changed page is reported
        $emailContent .= "Hey, the following page has changed!\n\n".$url."\n\n";
    }

    // Save new content
    file_put_contents($savePath.$fileName,$currentContent);
}

// Send the email if there's content!
if($emailContent) {
    // Sendmail! (the fifth argument of mail() is for extra sendmail flags, so "\r\n" is dropped)
    mail("me@myself.name","Sites Have Changed!",$emailContent,"From: alerts@myself.name");
    // Debug
    echo $emailContent;
}

The HTML DOM Parser is available here:

http://sourceforge.net/projects/simplehtmldom/files/

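However, my problem is that I want to change this code so that it reads the comment count and the rates count from meta tags. And don't forget, I am a complete beginner!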

Here are the meta tags. I only want to extract the numbers from them, for example the number of comments:

<meta item="desc" content="Comments:645">
<meta item="rates" content="Rates:112">

I have been tearing my hair out over this for several days, without success. Please help me!

One very important detail: the meta tags I want to extract are not in the document's head but in its body!

Have I explained myself clearly? If you need more information, just ask (within the limits of my small beginner skills ;-))
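To make the goal concrete, here is a minimal sketch of the kind of extraction described above. It is not the Simple HTML DOM approach from the question: it swaps in PHP's built-in DOMDocument and DOMXPath so that it runs without extra files. The function name `metaCount` and the `$sample` markup are hypothetical, and the sketch assumes the content attribute always has the form `Label:number`.

```php
<?php
// Hypothetical helper (not part of any library): returns the integer that
// follows the colon in the content attribute of the first <meta item="...">
// found anywhere in the document, or null if nothing usable is found.
function metaCount(string $html, string $item): ?int
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate invalid real-world markup
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // "//" searches the whole tree, so meta tags inside <body> match as well.
    // $item is assumed to be a trusted, simple value such as "desc".
    $nodes = $xpath->query('//meta[@item="' . $item . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $content = $nodes->item(0)->getAttribute('content');
    // Capture the digits after the colon, as in "Comments:645"
    if (preg_match('/:\s*(\d+)/', $content, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Assumed sample markup, with the meta tags placed in the body
$sample = '<html><body><div>'
        . '<meta item="desc" content="Comments:645">'
        . '<meta item="rates" content="Rates:112">'
        . '</div></body></html>';

echo metaCount($sample, 'desc'), "\n";
echo metaCount($sample, 'rates'), "\n";
```

In the script above, the same idea would probably fit by putting something like `meta[item=desc]` in the "selector" entry and reading the matched element's content attribute rather than its plaintext, since a meta tag has no text of its own. That part is untested here.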

0 answers:

No answers