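PDF is not generated from an HTTP website URL

Time: 2012-01-08 15:25:20

Tags: java pdf

I want to generate a PDF from a URL in a servlet. I am using Flying Saucer to generate the PDF from a website URL. My code is: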

/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */
package abc;

import com.lowagie.text.DocumentException;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.xhtmlrenderer.pdf.ITextRenderer;

public class TestPdf {
    public static void main(String args[]) throws MalformedURLException
    {
        String File_To_Convert = "test.htm";
        //String url = new File(File_To_Convert).toURI().toURL().toString();
        URL testurl = new URL("http://www.ipillion.com/");
        String url = testurl.toString();
        System.out.println("" + url);
        String HTML_TO_PDF = "d:\\ConvertedFile.pdf";
        OutputStream os;
        try {
            os = new FileOutputStream(HTML_TO_PDF);


            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(url);
            renderer.layout();

            renderer.createPDF(os);

            os.close();

        } catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}

I have tried several different websites, but an exception is thrown every time, for example:

run:
http://www.ipillion.com/
ERROR:  'The element type "SCRIPT" must be terminated by the matching end-tag "</SCRIPT>".'
org.xhtmlrenderer.util.XRRuntimeException: Can't load the XML resource (using TRaX transformer). org.xml.sax.SAXParseException; lineNumber: 12; columnNumber: 71; The element type "SCRIPT" must be terminated by the matching end-tag "</SCRIPT>".
    at org.xhtmlrenderer.resource.XMLResource$XMLResourceBuilder.createXMLResource(XMLResource.java:191)
    at org.xhtmlrenderer.resource.XMLResource.load(XMLResource.java:71)
    at org.xhtmlrenderer.swing.NaiveUserAgent.getXMLResource(NaiveUserAgent.java:211)
    at org.xhtmlrenderer.pdf.ITextRenderer.loadDocument(ITextRenderer.java:134)
    at org.xhtmlrenderer.pdf.ITextRenderer.setDocument(ITextRenderer.java:138)
    at abc.TestPdf.main(TestPdf.java:33)
Caused by: javax.xml.transform.TransformerException: org.xml.sax.SAXParseException; lineNumber: 12; columnNumber: 71; The element type "SCRIPT" must be terminated by the matching end-tag "</SCRIPT>".
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:723)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:317)
    at org.xhtmlrenderer.resource.XMLResource$XMLResourceBuilder.createXMLResource(XMLResource.java:189)
    ... 5 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 12; columnNumber: 71; The element type "SCRIPT" must be terminated by the matching end-tag "</SCRIPT>".
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1236)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:640)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:711)
    ... 7 more
BUILD SUCCESSFUL (total time: 9 seconds)

2 answers:

Answer 0 (score: 0)

You are using an XHTML renderer (org.xhtmlrenderer.pdf.ITextRenderer), but that website is written in HTML, not XHTML.
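
To illustrate the difference, here is a minimal sketch that uses only the JDK's SAX parser to test whether markup is well-formed XML, which is what an XHTML renderer needs. The class name WellFormedCheck, the method xmlError, and both sample strings are made up for this illustration. The second string shows just one kind of HTML that an XML parser rejects (mismatched tag case); it is not taken from the site in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Returns null if the markup is well-formed XML, otherwise the parser's message. */
    public static String xmlError(String markup) {
        try {
            // DefaultHandler ignores content and rethrows fatal parse errors.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(markup)), new DefaultHandler());
            return null;
        } catch (SAXException e) {
            return String.valueOf(e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical XHTML-style input: tags are closed and case matches.
        String xhtml = "<html><head><script src=\"a.js\"></script></head><body/></html>";
        // Hypothetical HTML-style input: <SCRIPT> is closed by </script>,
        // which browsers accept but a case-sensitive XML parser does not.
        String html = "<html><head><SCRIPT src=\"a.js\"></script></head><body></body></html>";

        System.out.println("xhtml error: " + xmlError(xhtml));
        System.out.println("html error:  " + xmlError(html));
    }
}
```

If a page fails a check like this, Flying Saucer's setDocument(url) can be expected to fail on it as well, because it parses the page as XML before laying it out.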

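Answer 1 (score: 0)

The problem is the input: it is not compatible with the renderer you are using.

I would either choose another renderer or tidy up the input so that it is valid XHTML. Piping the content through http://tidy.sourceforge.net/ should work.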
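
As a rough picture of what "tidying" means, the sketch below uses only the JDK. It is a toy stand-in, not how Tidy works: it self-closes a handful of void elements with a regular expression and then confirms that the result parses as XML. The class ToyTidy, its methods, and the sample markup are hypothetical, and a cleanup this small would not repair the SCRIPT error shown in the question. Real pages need a real cleaner such as the Tidy tool linked above.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ToyTidy {

    // A few elements that HTML allows without a closing tag.
    private static final Pattern VOID_TAG = Pattern.compile(
            "<(br|hr|img|meta|link|input)\\b([^>]*?)/?>", Pattern.CASE_INSENSITIVE);

    /** Rewrites e.g. <br> as <br/> so that an XML parser accepts it. */
    public static String closeVoidElements(String html) {
        return VOID_TAG.matcher(html).replaceAll("<$1$2/>");
    }

    /** True if the markup parses as XML with the JDK's SAX parser. */
    public static boolean isWellFormed(String markup) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(markup)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the markup
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><p>line one<br>line two</p><hr></body></html>";
        System.out.println(isWellFormed(html));      // false: <br> is never closed

        String cleaned = closeVoidElements(html);
        System.out.println(cleaned);
        System.out.println(isWellFormed(cleaned));   // true
    }
}
```

In a real program, the cleaned XHTML, not the original URL, would be what is handed to the renderer.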