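I want to use iText to read a PDF file that contains Persian characters. The text is extracted, but the words come out reversed, for example "ره" instead of "هر". As a workaround I split the text on "\n" and read each line from its end, but I think there may be a better way to read this PDF. Here is my code: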
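// Assumes iText 5 (packages com.itextpdf.text.pdf and com.itextpdf.text.pdf.parser).
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

import java.awt.ComponentOrientation;
import java.awt.Dimension;
import java.awt.Toolkit;
import java.io.File;
import java.io.IOException;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;
import javax.swing.WindowConstants;

public class Main extends JFrame {
    private static final int WIDTH = 600;
    private static final int HEIGHT = 600;
    /**
     * by Shomeis
     */
    private static final long serialVersionUID = 1L;

    public Main() {
        // Center a 600x600 window on the screen.
        Dimension dim = Toolkit.getDefaultToolkit().getScreenSize();
        int x = dim.width / 2 - WIDTH / 2;
        int y = dim.height / 2 - HEIGHT / 2;
        setBounds(x, y, WIDTH, HEIGHT);
        setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
        setMinimumSize(new Dimension(WIDTH, HEIGHT));

        File pdf = new File("E:\\guide1.pdf");
        if (!pdf.canRead() || !pdf.isFile()) {
            System.err.println("cannot read input file " + pdf.getAbsolutePath());
            return;
        }
        try {
            PdfReader reader = new PdfReader(pdf.getAbsolutePath());
            // StringBuilder avoids repeated String concatenation inside the loops.
            StringBuilder areaText = new StringBuilder();
            int pageCount = reader.getNumberOfPages();
            System.out.println(pageCount);
            for (int k = 1; k <= pageCount; k++) {
                System.out.println(k);
                String page = PdfTextExtractor.getTextFromPage(reader, k);
                // Workaround: the text arrives in visual order, so reverse every line.
                for (String line : page.split("\n")) {
                    areaText.append(new StringBuilder(line).reverse()).append("\n");
                }
            }
            reader.close();

            JTextArea text = new JTextArea(areaText.toString());
            JScrollPane sc = new JScrollPane(text);
            // setWrapStyleWord only takes effect when line wrapping is enabled.
            text.setLineWrap(true);
            text.setWrapStyleWord(true);
            text.setComponentOrientation(ComponentOrientation.RIGHT_TO_LEFT);
            this.setContentPane(sc);
            this.setVisible(true);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // Build the Swing UI on the event dispatch thread.
        SwingUtilities.invokeLater(Main::new);
    }
}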
Answer 0 (score: 0)
You can reverse the extracted text:
String res = strategy.getResultantText();
res = new StringBuilder(res).reverse().toString();
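Reversing the whole string also flips the line order, and reversing a line also flips any digits or Latin words inside it. A minimal sketch of a per-line variant follows, assuming the extracted text really is in visual (left-to-right) order. The class `VisualToLogical` and its methods `fixLine`, `fixPage` and `isLtr` are hypothetical names introduced here for illustration; they are not part of iText. The sketch reverses each line and then flips runs of left-to-right characters back. It does not handle number separators such as the dot in "1.5", nor combining marks, so treat it as a starting point rather than a full bidirectional algorithm.

```java
class VisualToLogical {

    /** Reverses one line, then restores the order of digit and Latin runs. */
    static String fixLine(String visual) {
        // StringBuilder.reverse() keeps surrogate pairs intact.
        String reversed = new StringBuilder(visual).reverse().toString();
        StringBuilder out = new StringBuilder(reversed.length());
        int i = 0;
        while (i < reversed.length()) {
            int cp = reversed.codePointAt(i);
            if (isLtr(cp)) {
                // Collect the whole left-to-right run and flip it back.
                int start = i;
                while (i < reversed.length() && isLtr(reversed.codePointAt(i))) {
                    i += Character.charCount(reversed.codePointAt(i));
                }
                out.append(new StringBuilder(reversed.substring(start, i)).reverse());
            } else {
                out.appendCodePoint(cp);
                i += Character.charCount(cp);
            }
        }
        return out.toString();
    }

    /** True for Latin-style letters and for European or Arabic-Indic digits. */
    static boolean isLtr(int cp) {
        byte d = Character.getDirectionality(cp);
        return d == Character.DIRECTIONALITY_LEFT_TO_RIGHT
                || d == Character.DIRECTIONALITY_EUROPEAN_NUMBER
                || d == Character.DIRECTIONALITY_ARABIC_NUMBER;
    }

    /** Applies fixLine to every line while keeping the lines in their original order. */
    static String fixPage(String page) {
        String[] lines = page.split("\n", -1);
        StringBuilder out = new StringBuilder(page.length());
        for (int k = 0; k < lines.length; k++) {
            if (k > 0) {
                out.append('\n');
            }
            out.append(fixLine(lines[k]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Visual-order input: the digits "123", a space, then the letters of "هر" stored backwards.
        System.out.println(fixLine("123 \u0631\u0647"));
    }
}
```

In the question's code, `fixPage(page)` could stand in for the inner reversing loop, with the result appended to the text shown in the `JTextArea`.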