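How do I handle special characters such as "&" when parsing XML in R?

Asked: 2014-05-01 13:40:06

Tags: html xml r

I am new to working with XML in R, so I need help getting past this problem.

I have the following XML: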

hp <- htmlParse('<li> <div class="subtle">Culture &amp; Values</div> <span   
class="notranslate gdBars sm " title="5.0"> <span class="gdBarsSep">Â </span><span  
class="gdBarsSep">Â </span><span class="gdBarsSep">Â </span><span class="gdBarsSep">Â  
</span><span class="gdBarsSep last">Â </span> <span style="width:94.5px" class="sel">
</span> </span>
</li> <li> <div class="subtle">Work/Life Balance</div> <span class="notranslate  
gdBars sm " title="4.0"> <span class="gdBarsSep">Â </span><span class="gdBarsSep">Â  
</span><span class="gdBarsSep">Â </span><span class="gdBarsSep">Â </span><span 
class="gdBarsSep last">Â </span> <span style="width:76.5px" class="sel"></span> </span> 
</li>')

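In the XML above, I am trying to grab the value of "title" for the div whose text is "Culture &amp; Values", using the R code below, but it does not give me the expected output. I get NULL, although I am expecting "5.0".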

CultureValues<-unlist(xpathApply(hp,"//div[text()='Culture &amp; Values']/following-sibling::span", xmlGetAttr,"title"))

Thanks a lot in advance.

1 Answer:

Answer 0: (score: 0)

You can use the following:

> xpathSApply(hp, "//*[contains(text(),'Culture ')]/following-sibling::span/@title")
title 
"5.0" 

Or change your query to:

> CultureValues<-unlist(xpathApply(hp,"//div[text()='Culture & Values']/following-sibling::span", xmlGetAttr,"title"))
> CultureValues
[1] "5.0"

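&amp; is just the entity name for &. htmlParse decodes the entity while parsing, so the text node of the div contains a literal &, and the string in the XPath expression has to use a plain & as well.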
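To check this yourself, here is a minimal sketch. It assumes the XML package is installed, and it uses a cut-down stand-in for the markup in the question (the names doc, div_text, title_ok and title_bad are only illustrative). It prints the div text as the parser stores it, then runs the query both ways.

```r
library(XML)

# Cut-down stand-in for the question's markup (decorative spans omitted).
doc <- htmlParse(
  '<li><div class="subtle">Culture &amp; Values</div><span class="gdBars" title="5.0"></span></li>',
  asText = TRUE
)

# The parser has already decoded the entity, so the stored text holds a plain "&".
div_text <- xpathSApply(doc, "//div[@class='subtle']", xmlValue)
print(div_text)

# XPath string literals are compared as written, so match on the decoded text.
title_ok <- xpathSApply(
  doc, "//div[text()='Culture & Values']/following-sibling::span",
  xmlGetAttr, "title"
)
print(title_ok)

# Leaving "&amp;" in the query looks for those five literal characters and matches no node.
title_bad <- xpathApply(
  doc, "//div[text()='Culture &amp; Values']/following-sibling::span",
  xmlGetAttr, "title"
)
print(length(title_bad))
```

If the exact label text may vary (extra whitespace, for example), the contains() form from the first suggestion is the more forgiving of the two.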