How do I parse a web page using PHP?

Date: 2013-08-06 08:06:34

Tags: php html html-parsing

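I am learning PHP and have covered some of the basics. Now I would like to learn how to parse web pages. I want to parse this page: http://www.icc-cricket.com/rankings/team-rankings/test
Specifically, I want to extract just these fields: Rank, Team, Matches, Points, Rating
1 South Africa 24 3240 135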

5 answers:

Answer 0: (score: 0)

I would recommend the Symfony2 DomCrawler Component: http://symfony.com/doc/current/components/dom_crawler.html

Answer 1: (score: 0)

If you know basic PHP, I suggest using this library: http://simplehtmldom.sourceforge.net/

It is easy to use.

Answer 2: (score: 0)

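You can take a look at http://simplehtmldom.sourceforge.net/, which lets you parse HTML pages easily.

That said, you should always check first whether the service offers a feed, because parsing a feed is less error-prone and more efficient, and feed formats (usually) do not change much. HTML markup can change over time, which would break your DOM queries.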
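For the HTML-parsing route, here is a minimal sketch that uses PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`) rather than simplehtmldom, so it runs without extra downloads. The table markup in `$html` and the helper name `extractTableRows` are made up for illustration; the real page's markup may look quite different, and it only applies when the table is actually present in the HTML you fetch.

```php
<?php
// Hypothetical helper: return the cell text of every data row
// (rows containing <td>) of the first <table> in $html.
function extractTableRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('(//table)[1]//tr[td]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

// Assumed sample markup, not copied from the real page.
$html = <<<'HTML'
<table>
  <tr><th>Rank</th><th>Team</th><th>Matches</th><th>Points</th><th>Rating</th></tr>
  <tr><td>1</td><td>South Africa</td><td>24</td><td>3240</td><td>135</td></tr>
</table>
HTML;

foreach (extractTableRows($html) as $cells) {
    echo implode(' ', $cells), "\n"; // prints "1 South Africa 24 3240 135"
}
```

The XPath expression is exactly the kind of DOM query that breaks when the site changes its markup, which is why a feed is preferable when one exists.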

Answer 3: (score: 0)

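It seems these scores are appended to the page via AJAX, so you cannot get the rankings by parsing this link directly. The request appears to be sent to http://cma.icc-cricket.com/api/getRankings?callback=onRankings&_1375776810417=

So you need to make a similar request and process the data. The result from that URL: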

onRankings([{"matchType":"TEST","rankings":[{"position":"1","team":{"fullName":"South Africa","abbreviation":"SA"},"qfyMatches":"0","played":"24","points":"3240","rating":"135"},{"position":"2","team":{"fullName":"India","abbreviation":"IND"},"qfyMatches":"0","played":"30","points":"3473","rating":"116"},{"position":"3","team":{"fullName":"England","abbreviation":"ENG"},"qfyMatches":"0","played":"32","points":"3577","rating":"112"},{"position":"4","team":{"fullName":"Australia","abbreviation":"AUS"},"qfyMatches":"0","played":"27","points":"2846","rating":"105"},{"position":"5","team":{"fullName":"Pakistan","abbreviation":"PAK"},"qfyMatches":"0","played":"19","points":"1947","rating":"102"},{"position":"6","team":{"fullName":"West Indies","abbreviation":"WI"},"qfyMatches":"0","played":"22","points":"2168","rating":"99"},{"position":"7","team":{"fullName":"Sri Lanka","abbreviation":"SL"},"qfyMatches":"0","played":"26","points":"2295","rating":"88"},{"position":"8","team":{"fullName":"New Zealand","abbreviation":"NZ"},"qfyMatches":"0","played":"27","points":"2126","rating":"79"},{"position":"9","team":{"fullName":"Bangladesh","abbreviation":"BAN"},"qfyMatches":"0","played":"13","points":"135","rating":"10"}]},{"matchType":"ODI","rankings":[{"position":"1","team":{"fullName":"India","abbreviation":"IND"},"qfyMatches":"0","played":"48","points":"5906","rating":"123"},{"position":"2","team":{"fullName":"Australia","abbreviation":"AUS"},"qfyMatches":"0","played":"34","points":"3861","rating":"114"},{"position":"3","team":{"fullName":"England","abbreviation":"ENG"},"qfyMatches":"0","played":"38","points":"4257","rating":"112"},{"position":"4","team":{"fullName":"Sri Lanka","abbreviation":"SL"},"qfyMatches":"0","played":"49","points":"5435","rating":"111"},{"position":"5","team":{"fullName":"South 
Africa","abbreviation":"SA"},"qfyMatches":"0","played":"34","points":"3584","rating":"105"},{"position":"6","team":{"fullName":"Pakistan","abbreviation":"PAK"},"qfyMatches":"0","played":"42","points":"4294","rating":"102"},{"position":"7","team":{"fullName":"New Zealand","abbreviation":"NZ"},"qfyMatches":"0","played":"29","points":"2593","rating":"89"},{"position":"8","team":{"fullName":"West Indies","abbreviation":"WI"},"qfyMatches":"0","played":"41","points":"3639","rating":"89"},{"position":"9","team":{"fullName":"Bangladesh","abbreviation":"BAN"},"qfyMatches":"0","played":"23","points":"1754","rating":"76"},{"position":"10","team":{"fullName":"Zimbabwe","abbreviation":"ZIM"},"qfyMatches":"0","played":"23","points":"1205","rating":"52"},{"position":"11","team":{"fullName":"Ireland","abbreviation":"IRE"},"qfyMatches":"0","played":"10","points":"394","rating":"39"},{"position":"12","team":{"fullName":"Netherlands","abbreviation":"NL"},"qfyMatches":"0","played":"7","points":"88","rating":"13"},{"position":"13","team":{"fullName":"Kenya","abbreviation":"KEN"},"qfyMatches":"0","played":"4","points":"40","rating":"10"}]},{"matchType":"T20I","rankings":[{"position":"1","team":{"fullName":"Sri Lanka","abbreviation":"SL"},"qfyMatches":"20","played":"16","points":"2003","rating":"125"},{"position":"2","team":{"fullName":"Pakistan","abbreviation":"PAK"},"qfyMatches":"31","played":"21","points":"2599","rating":"124"},{"position":"3","team":{"fullName":"India","abbreviation":"IND"},"qfyMatches":"18","played":"14","points":"1689","rating":"121"},{"position":"5","team":{"fullName":"South Africa","abbreviation":"SA"},"qfyMatches":"24","played":"18","points":"2158","rating":"120"},{"position":"4","team":{"fullName":"West 
Indies","abbreviation":"WI"},"qfyMatches":"22","played":"17","points":"2041","rating":"120"},{"position":"6","team":{"fullName":"England","abbreviation":"ENG"},"qfyMatches":"26","played":"19","points":"2148","rating":"113"},{"position":"7","team":{"fullName":"Australia","abbreviation":"AUS"},"qfyMatches":"23","played":"17","points":"1753","rating":"103"},{"position":"8","team":{"fullName":"New Zealand","abbreviation":"NZ"},"qfyMatches":"25","played":"19","points":"1937","rating":"102"},{"position":"unranked","team":{"fullName":"Afghanistan","abbreviation":"AFG"},"qfyMatches":"7","played":"6","points":"525","rating":"88"},{"position":"9","team":{"fullName":"Ireland","abbreviation":"IRE"},"qfyMatches":"12","played":"7","points":"568","rating":"81"},{"position":"10","team":{"fullName":"Bangladesh","abbreviation":"BAN"},"qfyMatches":"14","played":"10","points":"739","rating":"74"},{"position":"11","team":{"fullName":"Scotland","abbreviation":"Sco"},"qfyMatches":"9","played":"7","points":"435","rating":"62"},{"position":"12","team":{"fullName":"Zimbabwe","abbreviation":"ZIM"},"qfyMatches":"14","played":"10","points":"478","rating":"48"},{"position":"13","team":{"fullName":"Netherlands","abbreviation":"NL"},"qfyMatches":"8","played":"5","points":"181","rating":"36"},{"position":"14","team":{"fullName":"Kenya","abbreviation":"KEN"},"qfyMatches":"11","played":"9","points":"309","rating":"34"},{"position":"unranked","team":{"fullName":"Canada","abbreviation":"CAN"},"qfyMatches":"6","played":"4","points":"24","rating":"6"}]}]);
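The response above is JSONP: a JSON array wrapped in a call to `onRankings(...)`. One way to process it in PHP, as a rough sketch, is to strip the wrapper and pass the rest to `json_decode`. The helper names `decodeJsonp` and `rankingsFor` are made up for illustration, and `$sample` is a shortened copy of the response shown above; in practice you would fetch the body from the URL first (for example with `file_get_contents`), assuming the endpoint still responds in this format.

```php
<?php
// Hypothetical helper: strip a JSONP wrapper such as onRankings(...);
// and decode the JSON payload inside it into PHP arrays.
function decodeJsonp(string $jsonp): array
{
    $start = strpos($jsonp, '(');
    $end = strrpos($jsonp, ')');
    if ($start === false || $end === false || $end < $start) {
        throw new InvalidArgumentException('Input does not look like JSONP.');
    }
    $json = substr($jsonp, $start + 1, $end - $start - 1);
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new UnexpectedValueException('Could not decode JSON: ' . json_last_error_msg());
    }
    return $data;
}

// Hypothetical helper: return the ranking rows for one match type
// ("TEST", "ODI" or "T20I"), or an empty array if it is absent.
function rankingsFor(array $data, string $matchType): array
{
    foreach ($data as $group) {
        if (($group['matchType'] ?? null) === $matchType) {
            return $group['rankings'] ?? [];
        }
    }
    return [];
}

// Shortened sample of the response shown above (first TEST row only).
$sample = 'onRankings([{"matchType":"TEST","rankings":[{"position":"1",'
    . '"team":{"fullName":"South Africa","abbreviation":"SA"},'
    . '"qfyMatches":"0","played":"24","points":"3240","rating":"135"}]}]);';

foreach (rankingsFor(decodeJsonp($sample), 'TEST') as $row) {
    // prints "1 South Africa 24 3240 135"
    printf("%s %s %s %s %s\n", $row['position'], $row['team']['fullName'],
        $row['played'], $row['points'], $row['rating']);
}
```

Note that every value in the payload is a string, including `position`, which can be "unranked", so cast to int only after checking.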

But if you want to learn HTML parsing, you can also use Ganon.

Answer 4: (score: 0)

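In my opinion, it cannot be parsed directly, because the table is appended via an AJAX call. In the page source we can see an empty tag like this: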

<section class="standings"></section>
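To check this from PHP rather than by eye, the following sketch loads the HTML with the bundled DOM extension and reports whether the `standings` section has any content. The function name `standingsSectionIsEmpty` and the sample strings are illustrative assumptions; with the real page you would pass in the HTML you downloaded.

```php
<?php
// Hypothetical helper: true when the first <section class="standings">
// contains no child elements and no non-whitespace text.
function standingsSectionIsEmpty(string $html): bool
{
    $doc = new DOMDocument();
    // HTML5 tags such as <section> can trigger libxml warnings; collect them silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $sections = $xpath->query(
        '//section[contains(concat(" ", normalize-space(@class), " "), " standings ")]'
    );
    if ($sections === false || $sections->length === 0) {
        throw new RuntimeException('No <section class="standings"> element found.');
    }
    $section = $sections->item(0);
    $hasElements = $xpath->evaluate('count(*)', $section) > 0;
    return !$hasElements && trim($section->textContent) === '';
}

// Markup as served before any script runs (the case described above).
var_dump(standingsSectionIsEmpty(
    '<html><body><section class="standings"></section></body></html>'
)); // bool(true)

// Assumed markup after the table has been filled in.
var_dump(standingsSectionIsEmpty(
    '<html><body><section class="standings"><table><tr><td>1</td></tr></table></section></body></html>'
)); // bool(false)
```

If the function returns true for the downloaded page, the data has to come from the AJAX request instead.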

Correct me if I am wrong.

Thanks