I just got started with the RCurl package and after reading about the basics of xpath I tried to solve a simple example. What I'd like to do is to retrieve all the tables from a local sports site. I used the following code:
# used packages
library(RCurl)
library(XML)
# retrieve
test <- getURL(url = "http://sporza.be/cm/sporza/voetbal/Jupiler_Pro_League")
# parse
test <- htmlTreeParse(test)
# select tables
tabel <- xpathApply(test, "//table", xmlValue)
When I run this piece of code I always encounter the following error:
Error in UseMethod("xpathApply") :
no applicable method for 'xpathApply' applied to an object of class "XMLDocumentContent"
I seem to be missing something very basic but I can't seem to see what exactly.
Answer 0 (score: 2)
xpathApply does not work on XMLDocumentContent objects, but it does work on XMLNode objects. You can extract the root node of your document with xmlRoot and then run XPath queries against that object:
table <- xpathApply(xmlRoot(test), "//table", xmlValue)
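An alternative to calling xmlRoot is to parse the page into internal nodes in the first place, since xpathApply accepts such a document directly. The sketch below assumes a small inline HTML string (the `page` variable and its contents are made up for illustration) in place of the downloaded page, so it runs without network access:

```r
library(XML)

# Hypothetical stand-in for the text returned by getURL()
page <- paste0(
  "<html><body>",
  "<table><tr><td>Club A</td><td>3</td></tr></table>",
  "<table><tr><td>Club B</td><td>1</td></tr></table>",
  "</body></html>"
)

# htmlParse() builds an internal (C-level) document,
# which xpathApply() can query without going through xmlRoot()
doc <- htmlParse(page, asText = TRUE)

# One list element per <table>, each holding the table's text content
tables <- xpathApply(doc, "//table", xmlValue)
length(tables)
```

The same effect should be available from htmlTreeParse by passing useInternalNodes = TRUE, which keeps the rest of the original code unchanged.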
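Answer 1 (score: 1)
library(rvest)
# rvest: read_html() replaces the deprecated html(); html_table() returns one table per <table> element
all_table1 <- "http://sporza.be/cm/sporza/voetbal/Jupiler_Pro_League" %>%
  read_html() %>%
  html_table()
library(XML)
# XML: readHTMLTable() extracts every table from the parsed document
all_table2 <- readHTMLTable(htmlParse("http://sporza.be/cm/sporza/voetbal/Jupiler_Pro_League"))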