This is the HTML I want to scrape with XPath:
<table class="ClassGrid" cellspacing="0" cellpadding="0" border="0" id="_ctl0_phMainContent_dgrdClasses" style="border-collapse:collapse;">
<tbody>
<tr>
<td class="ClassGridRow1" colspan="3">
<hr>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1">Address 123
<br>
<br><a target="_blank" class="gridDirections" href="/Classes/Directions.aspx#104">Directions</a></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">12/12/2018</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl3_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4233&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">1/24/2019</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl4_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4306&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, August 4</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBoxNone"></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, August 18</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl6_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4346&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Thursday, August 30</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl7_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4313&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, September 8</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl8_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4330&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Tuesday, September 18</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl9_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4331&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1" colspan="3">
<hr>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1">Address 0000
<br><a target="_blank" class="gridDirections" href="/Classes/Directions.aspx#190">Directions</a></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, July 21</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl11_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4242&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Tuesday, August 28</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl12_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4243&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Tuesday, September 25</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl13_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4271&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1" colspan="3">
<hr>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1">Address 456
<br><a target="_blank" class="gridDirections" href="/Classes/Directions.aspx#69">Directions</a></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Wednesday, August 1</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl15_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4276&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, August 25</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl16_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4277&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Thursday, September 13</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl17_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4348&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, October 6</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl18_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4278&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Wednesday, October 31</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl19_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4279&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, November 17</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl20_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4280&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1" colspan="3">
<hr>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1">Address 789
<br><a target="_blank" class="gridDirections" href="/Classes/Directions.aspx#223">Directions</a></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, August 4</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl22_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4347&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Saturday, August 18</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl23_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4305&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1">
<div class="ClassGridBox1"></div>
</td>
<td class="ClassGridRow2">
<div class="ClassGridBox2">Thursday, September 20</div>
</td>
<td class="ClassGridRow3">
<div class="ClassGridBox3"><a id="_ctl0_phMainContent_dgrdClasses__ctl24_hplAddToCart" class="whitelight" href="/validate.aspx?ClassID1=4332&ClassID2=0">Book Now</a></div>
</td>
</tr>
<tr>
<td class="ClassGridRow1" colspan="3">
<hr>
</td>
</tr>
</tbody>
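I am trying to return the values of ClassGridRow1, ClassGridRow2 and ClassGridBox3 when ClassGridRow1 contains a given text string, for example "Address 123". So far I have not been able to get anything useful beyond the content of the context node. Can anyone help? Thanks a lot!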
Answer 0 (score: 0)
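If your processor supports the full XPath function library (XPath 2.0 or later), you can select the <div class="ClassGridBox1"> nodes and process their text with the regex function fn:replace. Use text()[1] rather than text(): the div holds several text nodes, and replace accepts at most one item as input:
//tbody/tr/td/div[@class="ClassGridBox1"]/replace(text()[1], '(^[a-zA-Z.-]+ [0-9]+).*', '$1', 's')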
Or keep the XPath simple and clean up the text afterwards:
//tbody/tr/td/div[@class="ClassGridBox1"]/text()
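The second approach (a simple XPath plus post-processing) could be sketched in Python as follows. This is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree, which supports just a small XPath subset and needs well-formed XML, so SAMPLE is a trimmed copy of the table with <br> and <hr> self-closed and "&" escaped. The names SAMPLE, ADDRESS_RE and rows_for_address are made up for this example. Since the address only appears in the first row of each group, the sketch remembers the last address seen and applies it to the rows that follow, until the next <hr> separator row.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the table in the question (assumption:
# <br>/<hr> are self-closed and "&" is escaped so an XML parser accepts it).
SAMPLE = """
<table class="ClassGrid">
<tbody>
<tr><td class="ClassGridRow1" colspan="3"><hr/></td></tr>
<tr>
<td class="ClassGridRow1"><div class="ClassGridBox1">Address 123
<br/>
<br/><a class="gridDirections" href="/Classes/Directions.aspx#104">Directions</a></div></td>
<td class="ClassGridRow2"><div class="ClassGridBox2">12/12/2018</div></td>
<td class="ClassGridRow3"><div class="ClassGridBox3"><a class="whitelight" href="/validate.aspx?ClassID1=4233&amp;ClassID2=0">Book Now</a></div></td>
</tr>
<tr>
<td class="ClassGridRow1"><div class="ClassGridBox1"></div></td>
<td class="ClassGridRow2"><div class="ClassGridBox2">1/24/2019</div></td>
<td class="ClassGridRow3"><div class="ClassGridBox3"><a class="whitelight" href="/validate.aspx?ClassID1=4306&amp;ClassID2=0">Book Now</a></div></td>
</tr>
<tr>
<td class="ClassGridRow1"><div class="ClassGridBox1"></div></td>
<td class="ClassGridRow2"><div class="ClassGridBox2">Saturday, August 4</div></td>
<td class="ClassGridRow3"><div class="ClassGridBoxNone"></div></td>
</tr>
<tr><td class="ClassGridRow1" colspan="3"><hr/></td></tr>
<tr>
<td class="ClassGridRow1"><div class="ClassGridBox1">Address 0000
<br/><a class="gridDirections" href="/Classes/Directions.aspx#190">Directions</a></div></td>
<td class="ClassGridRow2"><div class="ClassGridBox2">Saturday, July 21</div></td>
<td class="ClassGridRow3"><div class="ClassGridBox3"><a class="whitelight" href="/validate.aspx?ClassID1=4242&amp;ClassID2=0">Book Now</a></div></td>
</tr>
</tbody>
</table>
"""

# Same pattern as in the fn:replace expression: a word, a space, then digits.
ADDRESS_RE = re.compile(r"^[a-zA-Z.-]+ [0-9]+")


def rows_for_address(markup, wanted):
    """Return (address, date, booking href) for every row in the wanted group."""
    root = ET.fromstring(markup)
    current = None  # address of the group we are currently inside
    results = []
    for tr in root.findall(".//tbody/tr"):
        box1 = tr.find("./td/div[@class='ClassGridBox1']")
        if box1 is None:
            current = None  # <hr> separator row: the group ends here
            continue
        # .text is the text before the first child, i.e. text()[1] in XPath.
        match = ADDRESS_RE.match((box1.text or "").strip())
        if match:
            current = match.group(0)
        if current != wanted:
            continue
        date = tr.find("./td/div[@class='ClassGridBox2']")
        link = tr.find("./td/div[@class='ClassGridBox3']/a")
        results.append((
            current,
            (date.text or "").strip() if date is not None else "",
            link.get("href") if link is not None else None,
        ))
    return results


for row in rows_for_address(SAMPLE, "Address 123"):
    print(row)
```

Rows without a "Book Now" link (the ClassGridBoxNone case) come back with None as the href. For the real page, which is not well-formed XML, an HTML-aware parser would be needed in place of ElementTree; the grouping logic would stay the same.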