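My task is to convert all the PDF files shown in Figure 1 below into a single CSV file, so that each row of the CSV file contains one PDF document. I am using the following code, but I am stuck. Any help or comments would be greatly appreciated.
Thanks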
# Convert multiple pdf files to CSV files before mining
install.packages('pdftools')
install.packages('xlsx')
# Relevant libraries
library("pdftools")
library("xlsx")
#Set up a path
a<-"my path"
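# List the PDF files, then strip spaces from their file names
folder <- list.files(path = a, pattern = "\\.pdf$", full.names = TRUE)
sapply(folder, FUN = function(i) {
  file.rename(from = i, to = file.path(dirname(i), gsub(" ", "", basename(i))))
})
# List the files again to pick up the new names
folder1 <- list.files(path = a, pattern = "\\.pdf$", full.names = TRUE)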
# Run xpdf's pdftotext on each file; this writes one .txt file next to each PDF
lapply(folder1, function(i) system(paste('"C:/Program Files/xpdf/bin64/pdftotext.exe"', paste0('"', i, '"')), wait = FALSE))
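
For comparison, here is a minimal sketch of one way to get one row per PDF using pdftools alone, without calling pdftotext.exe. It is an assumption about what the final CSV should look like, not a tested solution for these particular files: the helper name pdfs_to_table, the column names file and text, and the output file name all_pdfs.csv are made up for illustration. pdf_text() returns one string per page, so the pages are pasted together to get a single string per document.

```r
# Hypothetical helper: read each PDF and return a data frame with one row per document.
# `reader` defaults to pdftools::pdf_text, which returns a character vector with one
# element per page; it is a parameter only so another reader can be swapped in.
pdfs_to_table <- function(files, reader = pdftools::pdf_text) {
  texts <- vapply(
    files,
    function(f) paste(reader(f), collapse = "\n"),  # join all pages of one PDF
    character(1)
  )
  data.frame(
    file = basename(files),
    text = unname(texts),
    stringsAsFactors = FALSE
  )
}

a <- "my path"  # placeholder folder, as in the question

# Only run when the folder actually exists
if (dir.exists(a)) {
  files <- list.files(path = a, pattern = "\\.pdf$", full.names = TRUE,
                      ignore.case = TRUE)
  docs <- pdfs_to_table(files)
  # One CSV file, one row per PDF (columns: file, text)
  write.csv(docs, file.path(a, "all_pdfs.csv"), row.names = FALSE)
}
```

Since the text of a PDF usually contains commas and line breaks, the text column ends up as a quoted multi-line field; write.csv quotes character columns by default, so read.csv should read it back as one field per document. Scanned PDFs with no text layer would give empty strings here and would need OCR instead.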