Xpath error with object of class XMLDocumentContent

Asked: 2015-05-04 19:47:35

Tags: xml r xpath rcurl

I just got started with the RCurl package and after reading about the basics of xpath I tried to solve a simple example. What I'd like to do is to retrieve all the tables from a local sports site. I used the following code:

# used packages
library(RCurl)
library(XML)

# retrieve
test <- getURL(url = "http://sporza.be/cm/sporza/voetbal/Jupiler_Pro_League")

# parse
test <- htmlTreeParse(test)

# select tables
tabel <- xpathApply(test, "//table", xmlValue)

When I run this piece of code I always encounter the following error:

Error in UseMethod("xpathApply") : 
no applicable method for 'xpathApply' applied to an object of class "XMLDocumentContent"

I seem to be missing something very basic but I can't seem to see what exactly.

2 answers:

Answer 0 (score: 2)

xpathApply does not work on XMLDocumentContent objects, but it does work on XMLNode objects. You can extract the root node of your document with xmlRoot and then run XPath queries against that node:

table <- xpathApply(xmlRoot(test), "//table", xmlValue)
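Another way to avoid the error, sketched here as an alternative rather than as part of the answer above, is to parse with internal nodes so that the result is an XMLInternalDocument, which xpathApply accepts directly. The inline HTML string and the variable names are made up for illustration so the snippet runs without network access.

```r
library(XML)

# Small made-up HTML document, used instead of the live page
html_text <- "<html><body>
  <table><tr><td>A</td><td>1</td></tr></table>
  <table><tr><td>B</td><td>2</td></tr></table>
</body></html>"

# Default parse: an R-level tree of class XMLDocumentContent,
# the class named in the error message
r_tree <- htmlTreeParse(html_text, asText = TRUE)
class(r_tree)

# Parse with internal (C-level) nodes instead; htmlParse() uses this by default
doc <- htmlTreeParse(html_text, asText = TRUE, useInternalNodes = TRUE)
class(doc)

# XPath queries now work on the document itself
tables <- xpathApply(doc, "//table", xmlValue)
length(tables)
```

With the page from the question, the same idea would be htmlParse(test, asText = TRUE) on the string returned by getURL.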

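Answer 1 (score: 1)

Solution 1

library(rvest)
# html() was the reader in older rvest releases; read_html() is its current name
all_table1 <- "http://sporza.be/cm/sporza/voetbal/Jupiler_Pro_League" %>%
  read_html() %>%
  html_table()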

Solution 2

library(XML)
all_table2<-readHTMLTable(htmlParse("http://sporza.be/cm/sporza/voetbal/Jupiler_Pro_League"))
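Both solutions return a list with one element per table found on the page, so a single table still has to be picked out of that list. A minimal sketch of this with readHTMLTable, using a made-up inline table in place of the live page (the team names and points are invented):

```r
library(XML)

# Made-up stand-in for the page; the real site will have more and larger tables
html_text <- "<html><body>
  <table>
    <tr><th>Team</th><th>Points</th></tr>
    <tr><td>A</td><td>10</td></tr>
    <tr><td>B</td><td>7</td></tr>
  </table>
</body></html>"

doc <- htmlParse(html_text, asText = TRUE)

# One data frame per <table> element, collected in a list
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)
length(tables)

# Pick the first table and inspect its structure
standings <- tables[[1]]
str(standings)
```

On the real page it is worth checking length(all_table2) and looking at each element to find the table you are after.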