Among other things, this code..
study.name <- 'NLSY79'
library(XML)
library(httr)
sub.study <- paste0( "https://www.nlsinfo.org/investigator/servlet1?get=SUBSTUDIES&study=" , study.name )
study.html <- GET( sub.study )
content( study.html )
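# pass the response body as text; htmlParse expects a character string, not a response object
study.block <- htmlParse( content( study.html , as = "text" ) , asText = TRUE )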
..gives me..
$children$html
<html>
<body>
<p>
false
<select id="thesubstudies" onchange="onSubstudyChanged(this);">
<option value="-1" selected="selected">(Choose One)</option>
<option value="343.06">NLSY79 (1979-2010)</option>
</select>
</p>
</body>
</html>
I just want a quick (automated) way to extract "343.06".
Thanks!
Answer 0 (score: 3)
You can use xpathSApply to extract the elements you want:
xpathSApply(study.block, "//option")
# [[1]]
# <option value="-1" selected="selected">(Choose One)</option>
# [[2]]
# <option value="343.06">NLSY79 (1979-2010)</option>
and apply a function to them (xmlValue or xmlAttrs, depending on what you need):
xpathSApply(study.block, "//option", function(u) xmlAttrs(u)["value"])
# value value
# "-1" "343.06"
Answer 1 (score: 1)
You can also use xmlGetAttr:
xpathSApply(study.block, "//option", xmlGetAttr, "value")
[1] "-1" "343.06"
or
xpathSApply(study.block, "//option[not(@selected)]", xmlGetAttr, "value")
[1] "343.06"
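If the page might contain other option elements, the XPath can be narrowed further. The sketch below is one possible variant, not something from the original thread: it assumes the select element keeps the id "thesubstudies" and that the placeholder always has the value "-1". It parses an inline copy of the markup, and the names `snippet`, `doc` and `substudy.id` are illustrative.

```r
library(XML)

# Inline copy of the HTML shown in the question (assumed to match the live page)
snippet <- '<html><body><p>false
<select id="thesubstudies" onchange="onSubstudyChanged(this);">
<option value="-1" selected="selected">(Choose One)</option>
<option value="343.06">NLSY79 (1979-2010)</option>
</select></p></body></html>'

doc <- htmlParse(snippet, asText = TRUE)

# Only look inside the select with the expected id, and skip the
# "(Choose One)" placeholder by its value rather than by @selected
substudy.id <- xpathSApply(
    doc ,
    "//select[@id='thesubstudies']/option[@value != '-1']" ,
    xmlGetAttr , "value"
)

# The attribute comes back as a character string; convert it if a number is needed
substudy.id <- as.numeric( substudy.id )
substudy.id
```

Filtering on the value instead of the selected attribute keeps working even if the server marks a different option as selected.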