Web scraping: getting content from XML

Time: 2012-04-25 10:26:41

Tags: php

How can I use PHP to get content from an XML page? The content looks like this:

 <entry>
   <title>News</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>

 <entry>
   <title>News2</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username2</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>

 <entry>
   <title>News3</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username3</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>

 <entry>
   <title>News4</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username4</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>

How can I use PHP to get an array of the titles, the blog links (<link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>), the author details such as name and uri (the profile link), and the summaries?

1 answer:

Answer 0: (score: 1)

Take a look at SimpleXML and XPath: http://php.net/manual/en/book.simplexml.php

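    $file = 'url or file name';
    $xml = simplexml_load_file($file);  // pass the variable itself; '$file' in single quotes is a literal string
    $list = $xml->xpath('//entry');     // every <entry> at any depth; "/entry" only matches when <entry> is the root
    print $list[0]->id;
    #var_dump($list);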
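
To collect all the fields the question asks for (title, link href, author name and uri, summary), something like the following might work. It is only a sketch under a few assumptions: the entries are wrapped in a hypothetical `<feed>` root so the sample is well-formed, the XML declares no namespace, and each entry has a single `<link>` element. The `$source` and `$entries` names are illustrative. A real Atom feed usually declares the `http://www.w3.org/2005/Atom` namespace, in which case the XPath query would need a prefix registered with `registerXPathNamespace()` first.

```php
<?php
// Sample data: two entries from the question, wrapped in a hypothetical
// <feed> root element (an assumption; the question shows only the entries).
$source = <<<XML
<feed>
  <entry>
    <title>News</title>
    <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
    <author>
      <name>Username</name>
      <uri>http://www.website.com/profile/username</uri>
    </author>
    <summary type="html">
      Hi this is the latest news
    </summary>
  </entry>
  <entry>
    <title>News2</title>
    <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
    <author>
      <name>Username2</name>
      <uri>http://www.website.com/profile/username</uri>
    </author>
    <summary type="html">
      Hi this is the latest news
    </summary>
  </entry>
</feed>
XML;

// For a remote or local file, simplexml_load_file($file) would be used instead.
$xml = simplexml_load_string($source);
if ($xml === false) {
    exit("Could not parse the XML\n");
}

// Build one associative array per <entry>.
$entries = [];
foreach ($xml->xpath('//entry') as $entry) {
    $entries[] = [
        'title'   => (string) $entry->title,
        'link'    => (string) $entry->link['href'],   // attribute of the first <link>
        'name'    => (string) $entry->author->name,
        'uri'     => (string) $entry->author->uri,
        'summary' => trim((string) $entry->summary),  // strip surrounding whitespace
    ];
}

print_r($entries);
```

The `(string)` casts matter because SimpleXML returns `SimpleXMLElement` objects rather than plain strings. If an entry can carry several `<link>` elements, filtering on `rel="alternate"` (for example with the relative XPath `link[@rel="alternate"]/@href`) would be safer than taking the first one.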