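I am trying to extract a single page from a PDF by translating the splitPDF method I found here: http://viralpatel.net/blogs/itext-tutorial-merge-split-pdf-files-using-itext-jar/
I keep getting this error:
IOException Stream Closed  java.io.FileOutputStream.writeBytes (:-2)
This prevents me from opening the document while my REPL is still open. Once I close the REPL, I can access the document.
Why am I getting the error?
How can I fix it?
How can I make the code more idiomatic Clojure?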
(import '(com.itextpdf.text Document)
        '(com.itextpdf.text.pdf PdfReader PdfWriter PdfContentByte PdfImportedPage BaseFont)
        '(java.io File FileInputStream FileOutputStream InputStream OutputStream))

(defn extract-page [src dest pagenum]
  (with-open [d  (Document.)
              os (FileOutputStream. dest)]
    (let [srcpdf  (->> src FileInputStream. PdfReader.)
          destpdf (PdfWriter/getInstance d os)]
      (doto d
        (.open)
        (.newPage))
      (.addTemplate
        (.getDirectContent destpdf)
        (.getImportedPage destpdf srcpdf pagenum) 0 0))))
Answer 0 (score: 3)
You need to close the document yourself, while the output stream is still open:
(.close d)
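As for why the question's version fails: `with-open` closes its bindings in reverse order, so there `os` is closed before `d`, and the document then has a closed stream to write to, which would explain the "Stream Closed" error. A minimal sketch of that ordering, using stand-in `Closeable` objects instead of iText classes (`tracked` and `close-order` are hypothetical names made up for this illustration):

```clojure
(import '(java.io Closeable))

(defn tracked
  "Returns a Closeable that appends label to the log atom when it is closed."
  [log label]
  (reify Closeable
    (close [_] (swap! log conj label))))

(defn close-order
  "Binds a stand-in document first and a stand-in stream second, as in the
  question, and returns the order in which with-open closed them."
  []
  (let [log (atom [])]
    (with-open [doc (tracked log :document)
                os  (tracked log :stream)]
      nil)
    @log))

(close-order) ;; => [:stream :document]
```

The stream is closed first, the document second, which is the wrong way round for a writer that flushes on close.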
The following code works:
(import '(com.itextpdf.text Document)
        '(com.itextpdf.text.pdf PdfReader PdfWriter PdfContentByte PdfImportedPage BaseFont)
        '(java.io File FileInputStream FileOutputStream InputStream OutputStream))

(defn extract-page [src dest pagenum]
  (with-open [is (FileInputStream. src)
              os (FileOutputStream. dest)]
    (let [srcpdf  (PdfReader. is) ; read the source through the stream opened above
          d       (Document.)
          destpdf (PdfWriter/getInstance d os)]
      (doto d
        (.open)
        (.newPage))
      (println "Number of pages" (.getNumberOfPages srcpdf))
      ;; Copy the requested page onto the new page at the origin.
      (.addTemplate
        (.getDirectContent destpdf)
        (.getImportedPage destpdf srcpdf pagenum) 0 0)
      ;; Close the document while os is still open so the PDF can be finished.
      (.close d))))
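On making this more idiomatic: one option, sketched here under assumptions, is to bind the stream first and the document second in the same `with-open`, so that the document is closed first and the stream after it, even if the body throws. The sketch uses a hypothetical `fake-document` stand-in that writes a trailer when closed (it is not an iText class) and a temp file, and compares both orderings:

```clojure
(import '(java.io Closeable File FileOutputStream IOException))

(defn fake-document
  "Hypothetical stand-in for a PDF document: writes a trailer to os when closed."
  [^FileOutputStream os]
  (reify Closeable
    (close [_] (.write os (.getBytes "trailer")))))

(defn write-stream-first
  "Binds the stream before the document, so with-open closes the document
  first and the stream last. Returns the resulting file content."
  [^File f]
  (with-open [os (FileOutputStream. f)
              d  (fake-document os)]
    (.write os (.getBytes "body;")))
  (slurp f))

(defn close-stream-too-early
  "Closes the stream before the document, mimicking the question's ordering.
  Returns :stream-closed if closing the document then fails with an IOException."
  [^File f]
  (let [os (FileOutputStream. f)
        ^Closeable d (fake-document os)]
    (.close os)
    (try
      (.close d)
      :ok
      (catch IOException _ :stream-closed))))

(let [f (File/createTempFile "close-order" ".txt")]
  [(write-stream-first f) (close-stream-too-early f)])
```

Applied to `extract-page`, this would mean binding `os` before `d` in `with-open` and dropping the explicit `(.close d)`; that variant has not been tested against iText here.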
Edit:
In case you are interested, I found Apache PDFBox easier to use.
;; PDFBox 1.x package layout; in PDFBox 2.x PDFTextStripper lives in org.apache.pdfbox.text.
(import '(org.apache.pdfbox.pdmodel PDDocument)
        '(org.apache.pdfbox.util PDFTextStripper)
        '(java.io File OutputStreamWriter FileOutputStream BufferedWriter))

(defn convert-to-text [src dest]
  (with-open [pd (PDDocument/load (File. src))
              wr (BufferedWriter. (OutputStreamWriter. (FileOutputStream. (File. dest))))]
    (let [stripper (PDFTextStripper.)]
      (println "Number of pages" (.getNumberOfPages pd))
      ;; Write the extracted text of the whole document to dest.
      (.writeText stripper pd wr))))