I am trying to read user posts from http://www.propertyforum.com/forum/ into R. Once the file has been downloaded, the xmlTreeParse function throws the errors shown below.
library(XML)
file1 <- "FY12.xml"
download.file("http://www.propertyforum.com/forum/", file1, method="auto")
doc <- xmlTreeParse(file1, useInternalNodes = TRUE)
top <- xmlRoot(doc)
Error: 1: xmlParseEntityRef: no name
2: xmlParseEntityRef: no name
3: error parsing attribute name
4: attributes construct error
5: Couldn't find end of Start Tag scr line 38
6: StartTag: invalid element name
7: EntityRef: expecting ';'
8: EntityRef: expecting ';'
9: Opening and ending tag mismatch: link line 47 and head
10: Specification mandate value for attribute async
11: attributes construct error
12: Couldn't find end of Start Tag script line 151
13: Opening and ending tag mismatch: div line 147 and script
14: xmlParseEntityRef: no name
15: Specification mandate value for attribute itemscope
16: attributes construct error
17: Couldn't find end of Start Tag ol line 3466
18: Opening and ending tag mismatch: div line 3459 and ol
19: Specification mandate value for attribute async
20: attributes construct error
21: Couldn't find end of Start Tag script line 3906
22: Opening and ending tag mismatch: div line 3899 and script
23: Specification mandate value for attribute async
24: attr
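
The messages above (valueless attributes such as async and itemscope, bare & characters, mismatched tags) suggest that the downloaded page is ordinary HTML rather than well-formed XML. A minimal sketch of that difference follows, assuming the XML package's HTML parser is an acceptable substitute here. The snippet in `page` is made up for illustration and is not taken from the forum, and the `//p` XPath is likewise only a placeholder for wherever the posts actually live.

```r
library(XML)

# Hypothetical, deliberately malformed markup standing in for the forum page:
# a valueless attribute, a bare "&" in a URL, and unclosed <p> tags.
page <- '<html><head><script async src="a.js?x=1&y=2"></script></head>
<body><div itemscope><p>First post<p>Second post</div></body></html>'

# The strict XML parser rejects this input; catch the error and return NULL.
xml_result <- tryCatch(
  xmlParse(page, asText = TRUE),
  error = function(e) NULL
)

# The HTML parser is lenient and still builds a tree that can be queried.
doc <- htmlParse(page, asText = TRUE)
posts <- xpathSApply(doc, "//p", xmlValue)
print(posts)
```

If this is the cause, calling htmlParse(file1) (or htmlTreeParse(file1, useInternalNodes = TRUE)) on the downloaded file in place of xmlTreeParse may be enough, although the XPath needed to pick out the posts depends on the page's real structure.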