Problem downloading a PDF file with R

Time: 2012-02-14 16:12:46

Tags: r pdf

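I want to download a PDF file from the internet and save it to my local hard drive. After downloading, the output PDF has a lot of blank pages. What can I do to fix it?

Example: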

require(XML)
url <- ('http://cran.r-project.org/doc/manuals/R-intro.pdf')
download.file(url, 'introductionToR.pdf')

Thanks in advance.

2 answers:

Answer 0 (score: 29)

Try binary write mode (mode="wb"), like this:

download.file(url, 'introductionToR.pdf', mode="wb")

That works for me.
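A likely reason this helps: on Windows, download.file() writes in text mode ("w") unless the URL has an extension it treats as binary, and the newline translation can damage the byte streams inside a PDF, which then shows up as blank pages. Below is a minimal sketch of a wrapper that always downloads in binary mode and then checks that the result starts with the PDF signature "%PDF-". The names is_pdf and download_pdf are hypothetical helpers made up for this illustration; they are not part of R or of any package.

```r
# Hypothetical helper: TRUE if the file begins with the PDF magic bytes "%PDF-".
is_pdf <- function(path) {
  header <- readBin(path, what = "raw", n = 5L)
  identical(header, charToRaw("%PDF-"))
}

# Hypothetical wrapper: download in binary mode, then sanity-check the file.
download_pdf <- function(url, destfile) {
  # mode = "wb" writes the bytes unchanged (no newline translation)
  status <- download.file(url, destfile, mode = "wb", quiet = TRUE)
  if (status != 0L || !is_pdf(destfile)) {
    stop("download failed or the file is not a PDF: ", destfile)
  }
  invisible(destfile)
}
```

With the url from the question, download_pdf(url, 'introductionToR.pdf') would take the place of the bare download.file() call. The header check only confirms that the file looks like a PDF at its start; it does not validate the rest of the document.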

Answer 1 (score: -1)

You can use the tabulizer package to download PDFs and export their tables as a data.frame.

https://ropensci.org/tutorials/tabulizer_tutorial.html

# the install_github() calls below come from ghit, so install that package
install.packages("ghit")
# on 64-bit Windows
ghit::install_github(c("ropenscilabs/tabulizerjars", "ropenscilabs/tabulizer"), INSTALL_opts = "--no-multiarch")
# elsewhere
ghit::install_github(c("ropenscilabs/tabulizerjars", "ropenscilabs/tabulizer"))

library(tabulizer)

f2 <- "https://github.com/leeper/tabulizer/raw/master/inst/examples/data.pdf"
extract_tables(f2, pages = 1, method = "data.frame")