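How to scrape all divs across all pages of a website

Date: 2016-04-16 10:25:10

Tags: php html web-scraping

The site I need to scrape has a table with the same row structure on every page. With the code I found online, I can only grab the first row.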

$sql = "SELECT orderID, orderDate, shippingDate, staffName FROM purchase
INNER JOIN staff ON purchase.staffID =
staff.staffID
WHERE purchase.staffID = '".$name."'
ORDER BY orderDate";

My code:

<table class="players">
            <tr><th><a href="./?sort=pos&amp;order=a" rel="nofollow">Position</a></th><th><a href="./?sort=name&amp;order=a" rel="nofollow">Player Name</a></th><th><a href="./?sort=club_team&amp;order=a" rel="nofollow">Team Name</a></th><th><a href="./?sort=nationality&amp;order=a" rel="nofollow">Nationality</a></th><th><a href="./?sort=height" rel="nofollow">Height</a></th><th><a href="./?sort=weight" rel="nofollow">Weight</a></th><th><a href="./?sort=age&amp;order=a" rel="nofollow">Age</a></th><th><a href="./?sort=condition" rel="nofollow">Condition</a></th><th class=" selected"><a href="./?order=a" rel="nofollow">Overall Rating</a></th></tr>
            <tr><td class="posFW"><div title="Second Striker">SS</div></td><td class="left"><a href="./?id=7511">L. MESSI</a></td><td class="left"><a href="./?all=1&amp;club_team=%22FC BARCELONA%22&amp;sort=club_number&amp;order=a" rel="nofollow">FC BARCELONA</a></td><td class="left"><a href="./?nationality=%22ARGENTINA%22&amp;sort=national_number&amp;order=a" rel="nofollow">ARGENTINA</a></td><td>170</td><td>72</td><td>28</td><td class="condition"><img src="images/condition2.png" alt="2" /></td><td class="selected c3 lvl30">94</td></tr>

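I need to extract the position, player name, age, and overall rating. With my code I can only grab the first div title="... How do I scrape all the rows on all the pages?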

1 answer:

Answer 0: (score: 0)

Use the DOM extension with XPath to do this. You can find more help at http://scraping.pro/using-domxpath-for-parsing-page-content-in-php/
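
To illustrate the DOM and XPath approach, here is a minimal sketch that pulls the four requested fields from every data row of the players table. The function name extractPlayers and the trimmed sample markup are made up for this example, and the column positions (0, 1, 6, 8) are assumed from the header row shown in the question.

```php
<?php
// Sample markup trimmed from the table in the question; a real run would
// use the HTML fetched from each page instead.
$html = <<<'MARKUP'
<table class="players">
<tr><th>Position</th><th>Player Name</th><th>Team Name</th><th>Nationality</th><th>Height</th><th>Weight</th><th>Age</th><th>Condition</th><th>Overall Rating</th></tr>
<tr><td class="posFW"><div title="Second Striker">SS</div></td><td class="left"><a href="./?id=7511">L. MESSI</a></td><td class="left"><a href="#">FC BARCELONA</a></td><td class="left"><a href="#">ARGENTINA</a></td><td>170</td><td>72</td><td>28</td><td class="condition"><img src="images/condition2.png" alt="2" /></td><td class="selected c3 lvl30">94</td></tr>
</table>
MARKUP;

// Hypothetical helper: returns one associative array per player row.
function extractPlayers(string $html): array
{
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $players = [];

    // tr[td] matches only rows that contain data cells, so the header row is skipped.
    foreach ($xpath->query('//table[@class="players"]//tr[td]') as $row) {
        $cells = $xpath->query('./td', $row);
        if ($cells->length < 9) {
            continue; // not a full player row
        }
        // Column positions are assumed from the header order in the question.
        $players[] = [
            'position' => trim($cells->item(0)->textContent),
            'name'     => trim($cells->item(1)->textContent),
            'age'      => trim($cells->item(6)->textContent),
            'overall'  => trim($cells->item(8)->textContent),
        ];
    }
    return $players;
}

$players = extractPlayers($html);
print_r($players);
```

Because the query returns every matching row rather than just the first node, the same function can be called once per page. The question does not show how the site paginates, so fetching each page (for example with file_get_contents or cURL) and merging the results is left to the caller.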