XPath to select all siblings until the next div

Time: 2017-04-09 17:26:27

Tags: python html xpath

I have the following HTML structure. Each <div> represents a state, and each <a> tag represents a city in that state.

<div class="country">AL</div>
<a href=somelink>City1</a>
<a href=somelink>City2</a>
<a href=somelink>City3</a>
<a href=somelink>City4</a>
<a href=somelink>City5</a>
<div class="country">CA</div>
<a href=somelink>City21</a>
<a href=somelink>City22</a>
<a href=somelink>City23</a>
<a href=somelink>City24</a>
<div class="country">IL</div>
<a href=somelink>City31</a>
<a href=somelink>City32</a>
<a href=somelink>City33</a>
<a href=somelink>City34</a>

I need to extract all the <a> tags that belong to a particular state. I tried this:

//*[contains(text(), "CA")]/following-sibling::a[preceding::div]

But it gives me

  

City21 City22 City23 City24 City31 City32 City33 City34

whereas I only want

  

City21 City22 City23 City24

2 answers:

Answer 0 (score: 2)

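Try either of the following XPath expressions. Both select the CA cities by anchoring on AL, the state immediately before CA in the markup: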

//a[count(following-sibling::div)=count(//div[text()="AL"]/following-sibling::div)-1]

//a[preceding-sibling::div[2][text()="AL"]]
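As a quick check, here is a minimal sketch that runs both expressions against the markup from the question. It assumes the third-party lxml library is available and that the fragment is wrapped in <html><body>; the names PAGE, EXPR_COUNT and EXPR_POSITION are made up for this illustration.

```python
from lxml import html  # third-party package (assumed installed): pip install lxml

# Markup from the question, wrapped in <html><body> so it parses as one document.
PAGE = """<html><body>
<div class="country">AL</div>
<a href=somelink>City1</a>
<a href=somelink>City2</a>
<a href=somelink>City3</a>
<a href=somelink>City4</a>
<a href=somelink>City5</a>
<div class="country">CA</div>
<a href=somelink>City21</a>
<a href=somelink>City22</a>
<a href=somelink>City23</a>
<a href=somelink>City24</a>
<div class="country">IL</div>
<a href=somelink>City31</a>
<a href=somelink>City32</a>
<a href=somelink>City33</a>
<a href=somelink>City34</a>
</body></html>"""

tree = html.document_fromstring(PAGE)

# Expression 1: keep the <a> tags that have one fewer following <div> sibling
# than the AL <div> has (AL has two, so the matching <a> tags have one).
EXPR_COUNT = ('//a[count(following-sibling::div)='
              'count(//div[text()="AL"]/following-sibling::div)-1]')

# Expression 2: keep the <a> tags whose second-nearest preceding <div> sibling is AL.
EXPR_POSITION = '//a[preceding-sibling::div[2][text()="AL"]]'

by_count = [a.text for a in tree.xpath(EXPR_COUNT)]
by_position = [a.text for a in tree.xpath(EXPR_POSITION)]

print(by_count)     # ['City21', 'City22', 'City23', 'City24']
print(by_position)  # ['City21', 'City22', 'City23', 'City24']
```

To target a different state, the "AL" literal (and the -1 offset or the [2] position) would have to be adjusted to match that state's place in the document.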

Answer 1 (score: 1)

You can select the <a> tags whose nearest preceding <div> contains the state abbreviation:

//a[preceding::div[1][contains(text(), "CA")]]
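The same pattern can be applied to every state in turn. The sketch below assumes the third-party lxml library and uses a hypothetical helper, cities_by_state, that passes the abbreviation in as an XPath variable ($state) instead of hard-coding "CA". It also assumes that no other <div> appears between a state and its cities.

```python
from lxml import html  # third-party package (assumed installed): pip install lxml

# Markup from the question, wrapped in <html><body> so it parses as one document.
PAGE = """<html><body>
<div class="country">AL</div>
<a href=somelink>City1</a>
<a href=somelink>City2</a>
<a href=somelink>City3</a>
<a href=somelink>City4</a>
<a href=somelink>City5</a>
<div class="country">CA</div>
<a href=somelink>City21</a>
<a href=somelink>City22</a>
<a href=somelink>City23</a>
<a href=somelink>City24</a>
<div class="country">IL</div>
<a href=somelink>City31</a>
<a href=somelink>City32</a>
<a href=somelink>City33</a>
<a href=somelink>City34</a>
</body></html>"""


def cities_by_state(page):
    """Map each state abbreviation to the texts of the <a> tags that follow it.

    Hypothetical helper for illustration; not part of the original answer.
    """
    tree = html.document_fromstring(page)
    grouped = {}
    for div in tree.xpath('//div[@class="country"]'):
        state = div.text.strip()
        # Same expression as in the answer, with the literal replaced by $state.
        links = tree.xpath(
            '//a[preceding::div[1][contains(text(), $state)]]',
            state=state,
        )
        grouped[state] = [a.text for a in links]
    return grouped


grouped = cities_by_state(PAGE)
print(grouped["CA"])  # ['City21', 'City22', 'City23', 'City24']
```

Because contains() does a substring match, an exact comparison such as text()=$state may be safer if one abbreviation could appear inside another label.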