Edit: Following Parfait's suggestion, I got it to work by specifying the ISO-8859-1 encoding instead of UTF_8.
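The following is a minimal base-R sketch of why the encoding matters, not the actual IEEE response. It assumes the offending character is the single byte 0xBF ("¿" in ISO-8859-1), which is what the truncated title in the error output suggests. The sample XML string is made up for illustration, and `validUTF8()` needs R 3.3.0 or later.

```r
# A tiny stand-in for the response: a CDATA title containing byte 0xBF,
# which is "¿" in ISO-8859-1 but is not a valid UTF-8 sequence on its own.
raw_xml <- c(
  charToRaw("<root><title><![CDATA[Discussion on "),
  as.raw(0xBF),
  charToRaw("The measurement of noise]]></title></root>")
)
txt <- rawToChar(raw_xml)

# Read as UTF-8, the lone 0xBF byte is malformed, so a UTF-8 decoder stops there.
is_utf8 <- validUTF8(txt)

# Treat the bytes as ISO-8859-1 and convert them to UTF-8 instead.
fixed <- iconv(txt, from = "ISO-8859-1", to = "UTF-8")
fixed_is_utf8 <- validUTF8(fixed)

print(c(before = is_utf8, after = fixed_is_utf8))
```

With the XML package, the one-step equivalent is the change described in the edit: pass encoding = "ISO-8859-1" to xmlParse so the parser decodes the bytes correctly itself.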
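I am reading IEEE article metadata and abstracts.
I am looping over multiple pages of results. My code had been working fine, but this particular request produces the error below: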
require(XML)
link <- "http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?py=1934&hc=100&rs=1"
doc <- xmlParse(link, encoding = "UTF_8", options = NOCDATA)
Error:
input conversion failed due to input error, bytes 0x20 0x62 0x65 0x66
encoder errorCData section not finished
Discussion on ¿The measurement of noise, with s
Premature end of data in tag title line 3081
Premature end of data in tag document line 3077
Premature end of data in tag root line 3
Error: 1: input conversion failed due to input error, bytes 0x20 0x62 0x65 0x66
2: encoder error3: CData section not finished
Discussion on ¿The measurement of noise, with s
4: Premature end of data in tag title line 3081
5: Premature end of data in tag document line 3077
6: Premature end of data in tag root line 3
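I ran into the same error with this data set before, but managed to parse it by reading fewer records per request (now hc = 100 instead of hc = 1000).
The gateway query parameters are listed here: http://ieeexplore.ieee.org/gateway/
Why does this error occur, and what can I do to fix it?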
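For reference, the paging can be sketched roughly as below. This is only an illustration: `build_page_urls` and its `total` argument are hypothetical names, and it assumes `hc` is the number of records per request and `rs` is the 1-based index of the first record, as the example URL suggests.

```r
# Hypothetical helper: build one gateway URL per page of results.
# Assumes `hc` is the page size and `rs` the 1-based index of the first record.
build_page_urls <- function(year, total, page_size = 100) {
  # First record of each page: 1, 1 + page_size, 1 + 2 * page_size, ...
  starts <- seq(from = 1, to = total, by = page_size)
  sprintf(
    "http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?py=%d&hc=%d&rs=%d",
    as.integer(year), as.integer(page_size), as.integer(starts)
  )
}

# `total = 250` is a made-up record count used only for this example.
urls <- build_page_urls(year = 1934, total = 250)
print(urls)
```

Each URL can then be passed to xmlParse in turn. A smaller page_size limits how much is lost when a single page fails to parse.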
Session info:
R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] plyr_1.8.3 XML_3.98-1.3
loaded via a namespace (and not attached):
[1] slidify_0.4.5 markdown_0.7.7 tools_3.2.1 whisker_0.3-2 yaml_2.1.13 Rcpp_0.12.1
[7] knitr_1.11 stringr_1.0.0
Thanks for your help!