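In my Java web application I have a Persian template Word (docx) document that serves as a contract. I customize the document for each user with Apache POI, and I then have to convert it to PDF so that operators cannot tamper with the file. I tried to do the conversion with iText, but I did not succeed and could not find anything useful. Can anyone suggest a way to do the conversion with iText, or tell me whether there is another way to keep the file from being altered without converting it?
Edit: I did the conversion with the code below, but now my PDF file contains a lot of question marks. Can anyone help? Does iText support Persian or RTL languages? How can I solve this? I am using iText version 5.0.6.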
convertWordToPdf("D:/PrivateBanking/docxCo.docx","D:/PrivateBanking/docxCo.pdf");
public static void convertWordToPdf(String src, String desc){
    try{
        //create file inputstream object to read data from file
        FileInputStream fs=new FileInputStream(src);
        //create document object to wrap the file inputstream object
        XWPFDocument doc=new XWPFDocument(fs);
        //72 units=1 inch
        Document pdfdoc=new Document(PageSize.A4,72,72,72,72);
        //create a pdf writer object to write text to mypdf.pdf file
        PdfWriter pwriter=PdfWriter.getInstance(pdfdoc, new FileOutputStream(desc));
        //specify the vertical space between the lines of text
        pwriter.setInitialLeading(20);
        //get all paragraphs from word docx
        List<XWPFParagraph> plist=doc.getParagraphs();
        //open pdf document for writing
        pdfdoc.open();
        for (int i = 0; i < plist.size(); i++) {
            //read through the list of paragraphs
            XWPFParagraph pa = plist.get(i);
            //get all run objects from each paragraph
            List<XWPFRun> runs = pa.getRuns();
            //read through the run objects
            for (int j = 0; j < runs.size(); j++) {
                XWPFRun run=runs.get(j);
                //get pictures from the run and add them to the pdf document
                List<XWPFPicture> piclist=run.getEmbeddedPictures();
                //traverse through the list and write each image to a file
                Iterator<XWPFPicture> iterator=piclist.iterator();
                while(iterator.hasNext()){
                    XWPFPicture pic=iterator.next();
                    XWPFPictureData picdata=pic.getPictureData();
                    byte[] bytepic=picdata.getData();
                    Image imag=Image.getInstance(bytepic);
                    pdfdoc.add(imag);
                }
                //get color code
                int color=getCode(run.getColor());
                //construct font object
                Font f=null;
                if(run.isBold() && run.isItalic())
                    f= FontFactory.getFont(FontFactory.TIMES_ROMAN,run.getFontSize(),Font.BOLDITALIC, new BaseColor(color));
                else if(run.isBold())
                    f=FontFactory.getFont(FontFactory.TIMES_ROMAN,run.getFontSize(),Font.BOLD, new BaseColor(color));
                else if(run.isItalic())
                    f=FontFactory.getFont(FontFactory.TIMES_ROMAN,run.getFontSize(),Font.ITALIC, new BaseColor(color));
                else if(run.isStrike())
                    f=FontFactory.getFont(FontFactory.TIMES_ROMAN,run.getFontSize(),Font.STRIKETHRU, new BaseColor(color));
                else
                    f=FontFactory.getFont(FontFactory.TIMES_ROMAN,run.getFontSize(),Font.NORMAL, new BaseColor(color));
                //construct unicode string
                String text=run.getText(-1);
                byte[] bs;
                if (text!=null){
                    bs=text.getBytes();
                    String str=new String(bs,"UTF-8");
                    //add string to the pdf document
                    Chunk chObj1=new Chunk(str,f);
                    pdfdoc.add(chObj1);
                }
            }
            //output new line
            pdfdoc.add(new Chunk(Chunk.NEWLINE));
        }
        //close pdf document
        pdfdoc.close();
    }catch(Exception e){e.printStackTrace();}
}
public static int getCode(String code){
    int colorCode;
    if(code!=null)
        colorCode=Long.decode("0x"+code).intValue();
    else
        colorCode=Long.decode("0x000000").intValue();
    return colorCode;
}
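One plausible source of the question marks, independent of iText, is the pair `text.getBytes()` / `new String(bs,"UTF-8")` in the code above. Without an argument, `getBytes()` encodes with the platform default charset, which on JVMs older than Java 18 depends on the operating system locale. If that charset cannot represent Persian letters, each one is replaced by `?` before iText ever sees the text. The sketch below reproduces the effect with the standard library only; the class name `CharsetRoundTripDemo` and the choice of ISO-8859-1 as the stand-in for a non-Persian default charset are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of the code in the question.
final class CharsetRoundTripDemo {

    // Encodes the text with the given charset and decodes the bytes as UTF-8,
    // mirroring the getBytes() / new String(bytes, "UTF-8") pair in the question.
    static String roundTrip(String text, Charset encodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Persian word "salam", written with escapes so the source file stays ASCII.
        String persian = "\u0633\u0644\u0627\u0645";

        // ISO-8859-1 has no Persian letters, so every character becomes '?'.
        System.out.println(roundTrip(persian, StandardCharsets.ISO_8859_1));

        // Encoding and decoding with the same charset returns the original text.
        System.out.println(persian.equals(roundTrip(persian, StandardCharsets.UTF_8)));
    }
}
```

Since the `String` returned by `run.getText(-1)` is already Unicode, passing it to `new Chunk(text, f)` directly avoids the round trip altogether. Separately, the built-in Times-Roman font contains no Persian glyphs, so a font that covers Arabic script would presumably still be needed for the text to render.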
Answer 0 (score: 1)
import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;
import org.apache.commons.io.output.ByteArrayOutputStream;
import java.io.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
public class Converter {
    public static void main(String[] args) throws IOException, ExecutionException, InterruptedException {
        // Build a local converter; it relies on the MS Word installation on this machine.
        IConverter converter = LocalConverter.builder()
                .baseFolder(new File("D:\\input")) // folder for the converter's temporary files
                .workerPool(20, 25, 2, TimeUnit.SECONDS)
                .processTimeout(5, TimeUnit.SECONDS)
                .build();
        try (InputStream in = new BufferedInputStream(new FileInputStream("d:\\input.docx"));
             ByteArrayOutputStream bo = new ByteArrayOutputStream()) {
            Future<Boolean> conversion = converter
                    .convert(in).as(DocumentType.MS_WORD)
                    .to(bo).as(DocumentType.PDF)
                    .prioritizeWith(1000) // optional
                    .schedule();
            // Block until the conversion has finished.
            conversion.get();
            // Write the converted PDF bytes to disk.
            try (OutputStream outputStream = new FileOutputStream("D:\\output.pdf")) {
                bo.writeTo(outputStream);
            }
        } finally {
            // Shut the converter down so its worker threads are released.
            converter.shutDown();
        }
    }
}
These are the necessary Maven dependencies:
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-api</artifactId>
    <version>0.2.1</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-util-conversion</artifactId>
    <version>0.2.1</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-transformer</artifactId>
    <version>0.2.1</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-util-all</artifactId>
    <version>0.2.1</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-local</artifactId>
    <version>0.2.1</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.8.0-beta2</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.8.0-beta2</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-util-standalone</artifactId>
    <version>1.0.3</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-transformer-msoffice-word</artifactId>
    <version>1.0.3</version>
</dependency>
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>23.0</version>
</dependency>
However, you must install MS Office before running it.
Enjoy!
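The code in this answer calls `conversion.get()` with no timeout and ignores the returned `Boolean`. In a web application it may be safer to bound the wait and to treat `false`, a timeout, or an exception as a failed conversion. The following is a minimal sketch of that idea using only `java.util.concurrent`; the class `ConversionWaitDemo`, the method `awaitConversion`, and the stand-in task are hypothetical, and the real `Future<Boolean>` would come from the converter's `schedule()` call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; it is not part of documents4j.
final class ConversionWaitDemo {

    // Waits up to the given timeout and returns true only if the future
    // completed normally with Boolean.TRUE.
    static boolean awaitConversion(Future<Boolean> conversion, long timeout, TimeUnit unit) {
        try {
            return Boolean.TRUE.equals(conversion.get(timeout, unit));
        } catch (TimeoutException e) {
            // Took too long: ask the task to stop and report failure.
            conversion.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The task itself threw an exception.
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag for the caller and report failure.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for the real conversion job (assumption for illustration).
            Callable<Boolean> fakeConversion = () -> true;
            Future<Boolean> conversion = pool.submit(fakeConversion);
            System.out.println("converted: " + awaitConversion(conversion, 30, TimeUnit.SECONDS));
        } finally {
            pool.shutdown();
        }
    }
}
```

With such a helper the caller can decide whether to write the PDF bytes to disk or to report an error to the user, instead of blocking indefinitely if the Word process hangs.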
Answer 1 (score: 0)
If you want to use Apache POI to convert docx to PDF, you need the following jars in a suitable version:
org.apache.poi.xwpf.converter.core-x.x.x.jar
org.apache.poi.xwpf.converter.pdf-x.x.x.jar
If you would rather use a different library, you can try Docx4j. You can find examples here: https://www.docx4java.org/trac/docx4j
I hope this helps.