iText cannot extract Shruti text correctly in Java

Time: 2017-04-21 17:12:10

Tags: java pdf text itext

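I want to extract text set in the Shruti font from a PDF file and write it to a new PDF. I am using iText, but it does not extract the text correctly. What is the solution?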

I am using the iText 5.4 library.

In the new PDF, iText shows ',', '-', '_' and blanks in place of the Shruti font text.

The code I am using is:
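// Extract text from the PDF (iText 5)
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

  String iTextContent = "";
  try {
     // filePath (String) and password (byte[]) are placeholders for the real values
     PdfReader pdfreader = new PdfReader(filePath, password);
     iTextContent = PdfTextExtractor.getTextFromPage(pdfreader, 1);
     pdfreader.close();
  } catch (IOException ex) {
     Logger.getLogger(JFileChooserDemo.class.getName()).log(Level.SEVERE, null, ex);
  }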



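  // Write the new PDF file (iText 5)
  // Needs: com.itextpdf.text.{Document, Font, Paragraph},
  //        com.itextpdf.text.pdf.{BaseFont, PdfWriter}, java.io.FileOutputStream
  try {
     Document docNew = new Document();
     PdfWriter.getInstance(docNew, new FileOutputStream("D:\\demo.pdf"));
     docNew.open();

     // IDENTITY_H selects Unicode encoding, which non-Latin glyphs need.
     // EMBEDDED is stated explicitly so the Shruti glyphs travel with the file.
     BaseFont bf = BaseFont.createFont("D:\\DeskTop\\Pdf Box jar\\shruti.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
     Font f = new Font(bf, 5);
     // newStr is the text to write; it is not defined in this snippet
     docNew.add(new Paragraph(newStr, f));

     docNew.close(); // closing the document also closes the writer
  } catch (Exception e) {
     e.printStackTrace();
  }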
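
One way to narrow down where the text is lost is to inspect the extracted string before writing it. The sketch below is a minimal example that uses only the JDK; it assumes the Shruti text is Gujarati (the script Shruti is designed for), and the class and method names (`ExtractedTextCheck`, `countGujarati`, `dumpCodePoints`) are hypothetical helpers, not part of iText. If the count is zero and the dump shows only punctuation such as U+002C, the wrong characters already come out of the extraction step rather than the writing step.

```java
// Hypothetical diagnostic helper (not part of iText): inspects a string
// such as the iTextContent returned by the extraction step.
final class ExtractedTextCheck {

    // Counts the code points that belong to the Gujarati script.
    static int countGujarati(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.GUJARATI) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    // Renders every code point as U+XXXX, separated by spaces.
    static String dumpCodePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample values standing in for real extraction results.
        String expected = "\u0A97\u0A9C"; // two Gujarati letters
        String observed = ",-_";          // the kind of output described in the question
        System.out.println(countGujarati(expected));  // prints 2
        System.out.println(countGujarati(observed));  // prints 0
        System.out.println(dumpCodePoints(observed)); // prints U+002C U+002D U+005F
    }
}
```

In the question's code, the same calls could be made on `iTextContent` right after `getTextFromPage` returns. A zero count there would suggest that the source PDF does not map its Shruti glyphs back to Unicode, though that is only one possible cause.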

0 answers:

No answers