How do I get a proper Unicode string from a JEditorPane (no &#entities;)?

Time: 2014-06-04 03:49:51

Tags: java unicode html-entities jeditorpane

I need to get the text back from the editor pane in the same form as the input I passed to setText().

  private void test() {
      myFrame = new JFrame("JEditorPane Unicode Test");
      myFrame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
      myFrame.setSize(300,200);

      myPane = new JEditorPane();
      myPane.setContentType("text/html;charset=utf-8");
      myPane.setText(
         "Hello computer! - \u7535\u8111\u4F60\u597D\uFF01\n"
         + "Welcome to Herong's Website!\n"
         + "\u6B22\u8FCE\u4F60\u8BBF\u95EE\u548C\u8363\u7F51\u7AD9"
         + "\uFF01\nwww.herongyang.com <br>பாரதீய ஜனதா இளைஞர் அணி <b>தலைவர் அனுராக்சிங் தாகூர் எம்.பி.<b> நேற்று தேர்தல் <i>ஆணையர் வி.சம்பத்தை<i>");
      myFrame.setContentPane(myPane);
      String test = myPane.getText();
       try {
        JOptionPane.showMessageDialog(null, "myPane.gettext --> "+myPane.getText());
           System.out.println("myPane.getText() -->"+myPane.getText());
       } catch (Exception ex) {
           Logger.getLogger(JEditorPaneUnicode.class.getName()).log(Level.SEVERE, null, ex);
       }
  }


    @Override
    public void actionPerformed(ActionEvent e) {
        throw new UnsupportedOperationException("Not supported yet."); //To change body of generated methods, choose Tools | Templates.
    }
}

But the output of getText() looks like this:

myPane.getText()

<html>
   <head>

   </head>
   <body>
   Hello computer! - &#30005;&#33041;&#20320;&#22909;&#65281; Welcome to Herong's Website! &#27426;&#36814;&#20320;&#35775;&#38382;&#21644;&#33635;&#32593;&#31449;&#65281; 
www.herongyang.com<br>&#2986;&#3006;&#2992;&#2980;&#3008;&#2991; &#2972;&#2985;&#2980;&#3006; &#2951;&#2995;&#3016;&#2974;&#2992;&#3021; &#2949;&#2979;&#3007; <b>&#2980;&#2994;&#3016;&#2997;&#2992;&#3021; 
&#2949;&#2985;&#3009;&#2992;&#3006;&#2965;&#3021;&#2970;&#3007;&#2969;&#3021; &#2980;&#3006;&#2965;&#3010;&#2992;&#3021; &#2958;&#2990;&#3021;.&#2986;&#3007;. 
&#2984;&#3015;&#2993;&#3021;&#2993;&#3009; &#2980;&#3015;&#2992;&#3021;&#2980;&#2994;&#3021; <i>&#2950;&#2979;&#3016;&#2991;&#2992;&#3021; 
&#2997;&#3007;.&#2970;&#2990;&#3021;&#2986;&#2980;&#3021;&#2980;&#3016;</i>    </b>
  </body>
  </html>

My expected output is:

  <html>
  <head>

  </head>
  <body>
  Hello computer! - 电脑你好! Welcome to Herong's Website! 欢迎你访问和荣网站! www.herongyang.com <br>பாரதீய ஜனதா இளைஞர் அணி </b>தலைவர் அனுராக்சிங் தாகூர் எம்.பி.<b> நேற்று தேர்தல் <i>ஆணையர் வி.சம்பத்தை</i>
  </body>
  </html>

1 answer:

Answer 0: (score: 2)

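You cannot get the Unicode text with getText(), because it writes non-ASCII characters as numeric entities. Instead, you can use the following approach, which walks the document's element tree and rebuilds the HTML-formatted content from the editor pane:
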
    HTMLDocument doc = (HTMLDocument) myPane.getDocument();
    StringBuilder sb = new StringBuilder();
    javax.swing.text.Element[] styles = doc.getRootElements();
    for (int i = 0; i < styles.length; i++) {
        int size = styles[i].getElementCount();
        if (!styles[i].getName().contains("bidi root")) {
            //   System.out.println("<"+styles[i].getName()+">");
            //  sb.append("<"+styles[i].getName()+">");
        }
        for (int j = 0; j < size; j++) {
            String element = styles[i].getElement(j).getName();
            if (element.equals("body")) {
                int subsize = styles[i].getElement(j).getElementCount();
                for (int k = 0; k < subsize; k++) {
                    element = styles[i].getElement(j).getElement(k).getName();
                    if (element.equals("p-implied")) {
                        int subsubsize = styles[i].getElement(j).getElement(k).getElementCount();
                        String cond = "fail", boldc = "</b>", boldi = "</i>";
                        for (int l = 0; l < subsubsize; l++) {
                            javax.swing.text.Element elem = styles[i].getElement(j).getElement(k).getElement(l);
                            element = elem.getName();
                            if (!element.contains("content")) {
                                //    System.out.println("<"+element+">");
                                sb.append("<" + element + ">");
                            }
                            if (element.equals("content")) {
                                AttributeSet attributes = elem.getAttributes();
                                Enumeration<?> attrs = attributes.getAttributeNames();
                                while (attrs.hasMoreElements()) {
                                    String rft = attrs.nextElement().toString();
                                    if (rft.equals("b")) {
                                        //  System.out.println("<"+rft+">");
                                        sb.append("<" + rft + ">");
                                        cond = "passb";
                                    } else if (rft.equals("i")) {
                                        sb.append("<" + rft + ">");
                                        cond = "passi";
                                    }
                                }
                            }
                            try {
                                //      System.out.println( elem.getDocument().getText(elem.getStartOffset(), (elem.getEndOffset() - elem.getStartOffset())));
                                sb.append(elem.getDocument().getText(elem.getStartOffset(), (elem.getEndOffset() - elem.getStartOffset())));
                            } catch (BadLocationException ex) {
                                Logger.getLogger(GridEditor.class.getName()).log(Level.SEVERE, null, ex);
                            }

                            if (cond.equals("passi")) {
                                //    System.out.println( boldi);
                                sb.append(boldi);
                            }
                            if (cond.equals("passb")) {
                                // System.out.println( boldc);
                                sb.append(boldc);
                            }

                            cond = "fail";

                        }
                    }
                }
            }


        }

    }

    String text = sb.toString();
    System.out.println("final string --> "+text);

This string will give you the output you need.
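
As an alternative sketch (not part of the original answer), the output of getText() can be post-processed instead of walking the element tree: the numeric character references such as &#30005; are decoded back into the characters they stand for, and the markup is left untouched. The class name EntityDecoder and the method decodeNumericEntities are hypothetical names chosen for this illustration. The sketch assumes that only references above code point 127 should be decoded, so that anything standing for markup-significant ASCII characters stays escaped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Swing or the original answer.
final class EntityDecoder {

    // Matches decimal (&#30005;) and hexadecimal (&#x7535;) character references.
    private static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    private EntityDecoder() {
    }

    // Replaces numeric references for non-ASCII code points with the real characters.
    static String decodeNumericEntities(String html) {
        Matcher m = NUMERIC_ENTITY.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            // Assumption: keep ASCII and invalid references as written, so that
            // escaped markup characters are never turned back into markup.
            String replacement = (codePoint > 127 && Character.isValidCodePoint(codePoint))
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this helper, the call in the question would become EntityDecoder.decodeNumericEntities(myPane.getText()). Unlike the tree walk above, this keeps whatever structure getText() produces (html, head and body tags included) and only changes how the characters are written.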