PHP - RSS Parser XML

Date: 2017-04-16 22:30:31

Tags: php xml rss

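Question: How do I parse <media:content url="IMG" /> from XML?

OK. This is like asking why 1 + 1 = 2, while 2 + 2 = unavailable.

Original link: How to Parse XML with SimpleXML and PHP // by John Morris. https://www.youtube.com/watch?v=_1F1Iq1IIS8

Using his method, I can easily list the items in the New York Times RSS feed with the following code: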

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>How to Parse XML with SimpleXML and PHP</title>
</head>
<body>
<?php
$url = 'http://rss.nytimes.com/services/xml/rss/nyt/Sports.xml';
$xml = simplexml_load_file($url) or die("Can't connect to URL");

?><pre><?php //print_r($xml); ?></pre><?php

foreach ($xml->channel->item as $item) {
    printf('<li><a href="%s">%s</a></li>', $item->link, $item->title);
}
?>  
</body>
</html>

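Which gives:
Sparky Lyle in Monument Park? Fans Say Yes, but He Disagrees / The American Behind the N.B.A. in France / On Pro Basketball: "Scoring in a Hurry": the Spurs Deliver More Playoff Pain ...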

BUT

To get at media:content you apparently can't use simplexml_load_file, because it does not pick up any media:content tags.

So... yes, I searched the web. I found this example on StackOverflow:
get media:description and media:content url from xml

But using the code:

<?php
function feeds() 
{
    $url = "http://rss.nytimes.com/services/xml/rss/nyt/Sports.xml"; // xmld.xml contains above data
    $feeds = file_get_contents($url);
    $rss = simplexml_load_string($feeds);
    foreach($rss->channel->item as $entry) {
         if($entry->children('media', true)->content->attributes()) {
                $md = $entry->children('media', true)->content->attributes();
                print_r("$md->url");
            }
    }
}
?>

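gives me no errors. But also a blank page.

It seems most people (judging by Google searches) have little idea how to actually use media:content. So I have to turn to Stackoverflow and hope someone can provide an answer. I am even willing to not use SimpleXML.

What I want is to grab the media:content URL images and use them on an external site.

Also, if possible:
I would like to put the parsed XML items into an SQL database.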

1 Answer:

Answer 0: (score: 1)

I came up with this:

<?php
$url = "http://rss.nytimes.com/services/xml/rss/nyt/Sports.xml"; // New York Times sports feed
$feeds = file_get_contents($url);
$rss = simplexml_load_string($feeds);

$items = [];

foreach($rss->channel->item as $entry) {
    $image = 'N/A';
    $description = 'N/A';
    foreach ($entry->children('media', true) as $k => $v) {
        $attributes = $v->attributes();

        if ($k == 'content') {
            if (property_exists($attributes, 'url')) {
                $image = $attributes->url;
            }
        }
        if ($k == 'description') {
            $description = $v;
        }
    }

    $items[] = [
        'link' => $entry->link,
        'title' => $entry->title,
        'image' => $image,
        'description' => $description,
    ];
}

print_r($items);
?>

which gives:

Array
(
    [0] => Array
        (
            [link] => SimpleXMLElement Object
                (
                    [0] => https://www.nytimes.com/2017/04/17/sports/basketball/a-court-used-for-playing-hoops-since-1893-where-paris.html?partner=rss&emc=rss
                )

            [title] => SimpleXMLElement Object
                (
                    [0] => A Court Used for Playing Hoops Since 1893. Where? Paris.
                )

            [image] => SimpleXMLElement Object
                (
                    [0] => https://static01.nyt.com/images/2017/04/05/sports/basketball/05oldcourt10/05oldcourt10-moth-v13.jpg
                )

            [description] => SimpleXMLElement Object
                (
                    [0] => The Y.M.C.A. in Paris says its basketball court, with its herringbone pattern and loose slats, is the oldest one in the world. It has been continuously functional since the building opened in 1893.
                )

        )
.....
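
The values above are still SimpleXMLElement objects. If plain strings are easier to work with, a minimal sketch along these lines casts each value and reads media:content directly through the namespace prefix. The sample feed and the function name extractItems are made up for illustration; only the Media RSS namespace URI and the SimpleXML calls are real. Note also that the snippet in the question wraps its loop in feeds() but never calls it, which on its own would explain the blank page. simplexml_load_file should behave the same way as simplexml_load_string here, since namespaced elements are only reachable through children().

```php
<?php
// Hypothetical sample feed, shaped like the relevant parts of an RSS item.
$feed = <<<'XML'
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Sample story</title>
      <link>https://example.com/story</link>
      <media:content url="https://example.com/image.jpg" medium="image"/>
      <media:description>Sample caption</media:description>
    </item>
    <item>
      <title>Story without media</title>
      <link>https://example.com/plain</link>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: returns each item as an array of plain strings.
function extractItems(string $xml): array
{
    $rss = simplexml_load_string($xml);
    if ($rss === false) {
        return [];
    }

    $items = [];
    foreach ($rss->channel->item as $entry) {
        // Children in the "media" namespace (true = the first argument is a prefix).
        $media = $entry->children('media', true);

        $items[] = [
            'link' => (string) $entry->link,
            'title' => (string) $entry->title,
            // attributes() without arguments returns the un-namespaced attributes (url, medium, ...).
            'image' => isset($media->content)
                ? (string) $media->content->attributes()->url
                : 'N/A',
            'description' => isset($media->description)
                ? (string) $media->description
                : 'N/A',
        ];
    }

    return $items;
}

print_r(extractItems($feed));
```

With string values, print_r shows the URL and text directly instead of nested SimpleXMLElement objects, and the array can be passed to json_encode or a database layer without further conversion.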

You can iterate over the items:

foreach ($items as $item) {
    printf('<img src="%s">', $item['image']);
    // The array key set above is 'link', not 'url'.
    printf('<a href="%s">%s</a>', $item['link'], $item['title']);
}
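
Since the titles and links come from an external feed, it is probably worth escaping them before printing them into HTML. The following is only a sketch: the sample item and the helper names escapeHtml and renderItem are hypothetical, and the keys match the $items array built earlier.

```php
<?php
// Hypothetical item in the same shape as the entries of $items.
$items = [
    [
        'link' => 'https://example.com/story?partner=rss&emc=rss',
        'title' => 'Fans say "yes" & he disagrees',
        'image' => 'https://example.com/image.jpg',
    ],
];

// Escape a value for use in HTML text or in a quoted attribute.
function escapeHtml($value): string
{
    // The cast lets SimpleXMLElement values be passed in as well as strings.
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Build one list entry with an image and a linked title.
function renderItem(array $item): string
{
    return sprintf(
        '<li><img src="%s" alt=""><a href="%s">%s</a></li>',
        escapeHtml($item['image']),
        escapeHtml($item['link']),
        escapeHtml($item['title'])
    );
}

echo "<ul>\n";
foreach ($items as $item) {
    echo renderItem($item), "\n";
}
echo "</ul>\n";
```

Escaping turns characters such as & and " into entities, so a headline containing markup cannot break the page layout or inject tags.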

Hope this helps.
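
As for the second part of the question, putting the parsed items into an SQL database, one possible approach is PDO with a prepared statement. This is a rough sketch under several assumptions: it uses an in-memory SQLite database (so the pdo_sqlite extension must be available), the table name feed_items and its columns are invented, and the sample items stand in for the parsed feed. INSERT OR IGNORE is SQLite syntax; other databases spell this differently.

```php
<?php
// Hypothetical parsed items; the second and third share a link on purpose.
$items = [
    ['link' => 'https://example.com/a', 'title' => 'Story A', 'image' => 'https://example.com/a.jpg', 'description' => 'Caption A'],
    ['link' => 'https://example.com/b', 'title' => 'Story B', 'image' => 'N/A', 'description' => 'N/A'],
    ['link' => 'https://example.com/b', 'title' => 'Story B', 'image' => 'N/A', 'description' => 'N/A'],
];

// In-memory SQLite keeps the sketch self-contained; use your own DSN in practice.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Assumed schema: the link doubles as the primary key so re-imports do not duplicate rows.
$pdo->exec(
    'CREATE TABLE feed_items (
        link TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        image TEXT,
        description TEXT
    )'
);

// A prepared statement keeps feed content out of the SQL text itself.
$insert = $pdo->prepare(
    'INSERT OR IGNORE INTO feed_items (link, title, image, description)
     VALUES (:link, :title, :image, :description)'
);

foreach ($items as $item) {
    // Casts make this work for SimpleXMLElement values too.
    $insert->execute([
        ':link' => (string) $item['link'],
        ':title' => (string) $item['title'],
        ':image' => (string) $item['image'],
        ':description' => (string) $item['description'],
    ]);
}

$count = (int) $pdo->query('SELECT COUNT(*) FROM feed_items')->fetchColumn();
echo "Stored $count items\n";
```

Using the link as the key is just one way to avoid duplicates when the feed is fetched repeatedly; a separate guid column would work as well if the feed provides one.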