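I am splitting a PDF file into individual pages in R. After the files are generated, though, each file name contains as many zeros as the file has pages. So if my PDF has 10 pages, 10 zeros are appended to the file name, followed by the sequence number. This works for small files, but when I try to split a PDF with 1000 or more pages, my code breaks, because it first tries to add 1000 zeros and then the sequence number. Can anyone help me with this?
Error when splitting a file with more than 800 pages: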
Error in cpp_pdf_split(input, output, password) : open C:/Users/Desktop/Page .pdf_0000000000[...]0000000001.pdf: No such file or directory
Here is my code:
install.packages("qpdf")
library(qpdf)
pdf_split(file.choose(),output = NULL)
I would like the file names to contain only the page sequence number, or else a way to remove these unwanted zeros.
Answer 0: (score: 1)
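library(qpdf)
filePath <- file.choose()
# Page count read directly with qpdf; nrow() of textreadr::read_pdf() counts text rows, not pages
nPages <- pdf_length(filePath)
# Base name of the input file without its .pdf extension
baseName <- tools::file_path_sans_ext(basename(filePath))
for (i in seq_len(nPages))
{
  # Write page i to <baseName>_<i>.pdf in the current working directory
  pdf_subset(filePath, pages = i, output = paste0(baseName, "_", i, ".pdf"))
}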
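If you still want the files to sort correctly in a file browser, one option is to pad the page number only to the number of digits in the page count (4 digits for 1000 pages) rather than one zero per page. The following is a minimal sketch, not something qpdf provides: page_file_names and split_pdf_short_names are hypothetical helper names, and it assumes the pages should be written to the current working directory. It relies only on pdf_length() and pdf_subset() from qpdf.

```r
# Hypothetical helper (not part of qpdf): builds one output name per page,
# zero-padded to the number of digits in the page count.
page_file_names <- function(input, n_pages) {
  base <- tools::file_path_sans_ext(basename(input))
  # as.integer() avoids scientific notation such as "1e+05" when counting digits
  width <- nchar(as.integer(n_pages))
  paste0(base, "_", formatC(seq_len(n_pages), width = width, flag = "0"), ".pdf")
}

# Hypothetical wrapper: writes every page of `input` to its own short-named file.
split_pdf_short_names <- function(input) {
  out <- page_file_names(input, qpdf::pdf_length(input))
  for (i in seq_along(out)) {
    qpdf::pdf_subset(input, pages = i, output = out[i])
  }
  invisible(out)
}
```

Calling split_pdf_short_names(file.choose()) would then split the chosen file; for a 1000-page report.pdf the names run from report_0001.pdf to report_1000.pdf.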