我有以下HTML代码:
<tr data-live="COumykPG" data-dt="10,11,2017,19,00" data-def="1">
<td class="table-matches__tt"><span class="table-matches__time" data-live-cell="time">19:00</span><a href="/soccer/germany/oberliga-bremen/oberneuland-habenhauser/COumykPG/" data-live-cell="matchlink"><span>Oberneuland</span> - <span>Habenhauser</span></a></td>
<td class="livebet" data-live-cell="livebet"> </td>
<td class="table-matches__streams" data-live-cell="score">
</td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6ev9v"><a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv464x0x6ev9v&otheroutcomes=2p2k5xv498x0x0,2p2k5xv464x0x6eva0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">1.10</a></td>
<td class="table-matches__odds" data-oid="2p2k5xv498x0x0"><a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv498x0x0&otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv464x0x6eva0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">7.44</a></td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6eva0"><a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv464x0x6eva0&otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv498x0x0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">12.40</a></td>
</tr>
我尝试从以下代码中删除3个浮点值:1,10
7.44
12.40
我试图用来设置值的表达式如下:
response.xpath('//a/@target').extract()
我得到的输出是'mySelections'
。
我希望得到旁边的价值。什么是正确的表达方式?
提前谢谢
答案 0 :(得分:1)
response.xpath(&#39; // A /的 @target 强>&#39)。提取物()
如果格式化HTML,则错误很明显。
您想从
text
代码中提取a
,而不是target
属性。
<tr data-live="COumykPG" data-dt="10,11,2017,19,00" data-def="1">
<td class="table-matches__tt">
<span class="table-matches__time" data-live-cell="time">19:00</span>
<a href="/soccer/germany/oberliga-bremen/oberneuland-habenhauser/COumykPG/" data-live-cell="matchlink">
<span>Oberneuland</span> - <span>Habenhauser</span>
</a>
</td>
<td class="livebet" data-live-cell="livebet"> </td>
<td class="table-matches__streams" data-live-cell="score"></td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6ev9v">
<a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv464x0x6ev9v&otheroutcomes=2p2k5xv498x0x0,2p2k5xv464x0x6eva0"
onclick="return my_selections_click('1x2', 'soccer');"
title="Add to My Selections"
target="mySelections">1.10</a>
</td>
<td class="table-matches__odds" data-oid="2p2k5xv498x0x0">
<a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv498x0x0&otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv464x0x6eva0"
onclick="return my_selections_click('1x2', 'soccer');"
title="Add to My Selections"
target="mySelections">7.44</a>
</td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6eva0">
<a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv464x0x6eva0&otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv498x0x0"
onclick="return my_selections_click('1x2', 'soccer');"
title="Add to My Selections"
target="mySelections">12.40</a>
</td>
</tr>
使用以下其中一项
response.xpath('//a/text()').extract()
根据其他开发人员的说法,response.xpath
有时会导致错误,您应该使用scrapy's selector
。
from scrapy.selector import Selector
result_array = Selector(text=response.body).xpath('//a/text()').extract()