Extracting data from an XMLNodeSet in R

Time: 2015-07-07 16:14:04

Tags: xml r

I have an XMLNodeSet (XML package, R programming language) from which I need to extract data. This is what it looks like:

<drugScores>
            <drug code="3TC" score="95" levelStanford="5" levelSIR="R" />
            <drug code="ABC" score="110" levelStanford="5" levelSIR="R" />
            <drug code="ATV/r" score="90" levelStanford="5" levelSIR="R" />
            <drug code="AZT" score="140" levelStanford="5" levelSIR="R" />
            <drug code="D4T" score="150" levelStanford="5" levelSIR="R" />
            <drug code="DDI" score="135" levelStanford="5" levelSIR="R" />
            <drug code="DRV/r" score="0" levelStanford="1" levelSIR="S" />
            <drug code="EFV" score="100" levelStanford="5" levelSIR="R" />
            <drug code="ETR" score="40" levelStanford="4" levelSIR="I" />
            <drug code="FPV/r" score="55" levelStanford="4" levelSIR="I" />
            <drug code="FTC" score="95" levelStanford="5" levelSIR="R" />
            <drug code="IDV/r" score="85" levelStanford="5" levelSIR="R" />
            <drug code="LPV/r" score="75" levelStanford="5" levelSIR="R" />
            <drug code="NFV" score="120" levelStanford="5" levelSIR="R" />
            <drug code="NVP" score="150" levelStanford="5" levelSIR="R" />
            <drug code="RPV" score="45" levelStanford="4" levelSIR="I" />
            <drug code="SQV/r" score="105" levelStanford="5" levelSIR="R" />
            <drug code="TDF" score="85" levelStanford="5" levelSIR="R" />
            <drug code="TPV/r" score="35" levelStanford="4" levelSIR="I" />
</drugScores>

I need to build a matrix, data frame, or similar structure that contains the drug code, score, levelStanford, and levelSIR. Can you help me?

1 answer:

Answer 0: (score: -1)

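library(xml2)
library(rvest)
# Parse as XML rather than HTML: an HTML parser may lowercase attribute names,
# which would make lookups such as "levelStanford" return NA.
kk <- read_xml("yourxml.xml") %>%
  html_nodes("drug")
# Stack the attributes in long format: one row per (attribute, value) pair
attrs <- c("code", "score", "levelStanford", "levelSIR")
ss <- do.call(rbind, lapply(attrs, function(i) data.frame(type = i, values = html_attr(kk, i))))
head(ss)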
  type values
1 code    3TC
2 code    ABC
3 code  ATV/r
4 code    AZT
5 code    D4T
6 code    DDI
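
The output above is in long format (one row per attribute value), whereas the question asks for one row per drug with the four attributes as columns. A minimal sketch of that wide layout, assuming the XML package from the question and that every drug node carries the same four attributes in the same order; the inline `sample_xml` string and the names `attr_matrix` and `drug_df` are illustrative only, not part of the original post:

```r
library(XML)

# Small inline sample standing in for the real document
# (assumption: same structure as the XML shown in the question)
sample_xml <- '<drugScores>
  <drug code="3TC" score="95" levelStanford="5" levelSIR="R" />
  <drug code="DRV/r" score="0" levelStanford="1" levelSIR="S" />
  <drug code="ETR" score="40" levelStanford="4" levelSIR="I" />
</drugScores>'

doc <- xmlParse(sample_xml, asText = TRUE)
nodes <- getNodeSet(doc, "//drug")  # an XMLNodeSet, as in the question

# xmlAttrs() returns a named character vector for each node;
# binding them row-wise gives a character matrix with one row per drug
attr_matrix <- do.call(rbind, lapply(nodes, xmlAttrs))
drug_df <- as.data.frame(attr_matrix, stringsAsFactors = FALSE)

# Attribute values arrive as strings, so convert the numeric columns
drug_df$score <- as.numeric(drug_df$score)
drug_df$levelStanford <- as.integer(drug_df$levelStanford)

print(drug_df)
```

If a matrix is enough, `attr_matrix` can be used directly; the data frame step only matters when the columns should have different types.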