Convert HTML to RTF in Java?

Time: 2014-09-21 07:13:57

Tags: java html converter rtf

I need to convert HTML to RTF, and I am using this code:

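import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.rtf.RTFEditorKit;

private static String convertToRTF(String htmlStr) {
    OutputStream os = new ByteArrayOutputStream();
    HTMLEditorKit htmlEditorKit = new HTMLEditorKit();
    RTFEditorKit rtfEditorKit = new RTFEditorKit();
    String rtfStr = null;
    // Mark line breaks and paragraph ends with a placeholder that survives HTML parsing.
    htmlStr = htmlStr.replaceAll("<br.*?>", "#NEW_LINE#");
    htmlStr = htmlStr.replaceAll("</p>", "#NEW_LINE#");
    htmlStr = htmlStr.replaceAll("<p.*?>", "");
    InputStream is = new ByteArrayInputStream(htmlStr.getBytes());
    try {
        // Parse the HTML into a Swing document, then write that document out as RTF.
        Document doc = htmlEditorKit.createDefaultDocument();
        htmlEditorKit.read(is, doc, 0);
        rtfEditorKit.write(os, doc, 0, doc.getLength());
        rtfStr = os.toString();
        // Turn each placeholder into an RTF paragraph break.
        rtfStr = rtfStr.replaceAll("#NEW_LINE#", "\\\\par ");
    } catch (IOException e) {
        e.printStackTrace();
    } catch (BadLocationException e) {
        e.printStackTrace();
    }
    return rtfStr;
}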

The problem appears when I try to convert HTML that contains bullets or numbering, like this:

  1. one
  2. two

This is the HTML:

<html><head>
    <style>
      <!--
      -->
    </style>
  </head>
  <body contenteditable="true">
     <p style="text-align: left;">
         <ol>
             <li><font face="'Segoe UI'">one</font></li>
             <li><font face="'Segoe UI'">two</font></li>
         </ol>
   </p>

This is the result of the conversion:

onetwo

RTF:

{\rtf1\ansi
{\fonttbl\f0\fnil Monospaced;\f1\fnil 'Segoe UI';}

\par
\f1 one\f1 two\par \par
}

How can I convert the numbering and bullets?
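
One workaround I could imagine, in the same spirit as the #NEW_LINE# placeholder above, is to rewrite the list items as plain numbered or dashed lines before the HTML is parsed. The following is only a rough sketch under several assumptions: the class name ListFlattener and the method flattenLists are hypothetical (they are not part of my code or of any library), the matching is regex-based and does not handle nested lists, and the markers end up as literal text in the RTF rather than as real RTF list formatting.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not from any library): flattens <ol>/<ul> lists into
// plain text lines so that the markers survive the HTML-to-RTF conversion.
final class ListFlattener {
    // Matches a whole non-nested <ol>...</ol> or <ul>...</ul> block.
    private static final Pattern LIST =
            Pattern.compile("(?is)<(ol|ul)\\b[^>]*>(.*?)</\\1\\s*>");
    // Matches a single <li>...</li> item inside a list block.
    private static final Pattern ITEM =
            Pattern.compile("(?is)<li\\b[^>]*>(.*?)</li\\s*>");

    static String flattenLists(String html) {
        Matcher list = LIST.matcher(html);
        StringBuffer out = new StringBuffer();
        while (list.find()) {
            boolean ordered = list.group(1).equalsIgnoreCase("ol");
            Matcher item = ITEM.matcher(list.group(2));
            StringBuilder lines = new StringBuilder();
            int n = 1;
            while (item.find()) {
                // "1. ", "2. ", ... for ordered lists; "- " for bullet lists.
                String marker = ordered ? (n++) + ". " : "- ";
                lines.append(marker)
                     .append(item.group(1).trim())
                     .append("#NEW_LINE#");
            }
            // quoteReplacement keeps '$' and '\' in the item text literal.
            list.appendReplacement(out, Matcher.quoteReplacement(lines.toString()));
        }
        list.appendTail(out);
        return out.toString();
    }
}
```

The idea would be to call htmlStr = ListFlattener.flattenLists(htmlStr) at the top of convertToRTF, before the <br> and <p> replacements, so that the existing #NEW_LINE# to \par substitution also ends each list item. Inline tags inside an item (such as <font>) are left in place for the HTML parser to handle.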

1 answer:

Answer 0 (score: 2)

These libraries might help: