Question

I have this string, which I want to unescape: Character\5C&\22\3C\3E'

Here is my code:

package escaping;

import org.apache.commons.lang.StringEscapeUtils;

public class UnEscapingDemo {

    public static void main(String[] args) {

       String str = StringEscapeUtils.unescapeHtml("Character\\5C&\\22\\3C\\3E'");

       System.out.println(str);

    }

}

But I don't get the result I expect. The output is the same string, unconverted.

Why?

-

Edit

I believe that "3E" here stands for "&gt;", for example.

So the string I expect is: Character\&"<>'

Answer 1

What you have is not HTML escaping; it is closer to URI encoding. In HTML, < is written as &lt; and > as &gt;.

You should take a look at this thread and read the posts by Tim Cooper and Draemon.
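To make the distinction in this answer concrete, here is a minimal sketch using only the JDK. The class name `EncodingComparison` and the helper `percentDecode` are made up for illustration. It shows that the JDK's percent-decoder understands `%3C` but leaves both the backslash form from the question and HTML entities untouched, so three different notations are in play.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of the original question or answers.
final class EncodingComparison {

    // Thin wrapper around the JDK's percent-decoder.
    static String percentDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(percentDecode("%3C%3E"));    // prints <>
        System.out.println(percentDecode("\\3C\\3E"));  // prints \3C\3E (unchanged)
        System.out.println(percentDecode("&lt;&gt;"));  // prints &lt;&gt; (unchanged)
    }
}
```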

Answer 2

Well, this odd escaping syntax comes from OpenLDAP...

This worked for me:

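import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UnEscapingDemo {

    public static void main(String[] args) throws UnsupportedEncodingException {
        String input = "Character\\5C&\\22\\3C\\3E'";
        // Turn every backslash into '%' so the string looks percent-encoded
        input = input.replace("\\", "%");
        String result = URLDecoder.decode(input, "UTF-8");
        System.out.println(result);
    }

}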
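One caveat with the replace-then-`URLDecoder` approach: `URLDecoder` also turns `+` into a space and throws `IllegalArgumentException` on a `%` that is not followed by two hex digits, so it can misbehave if the input contains those characters literally. A sketch of an alternative, assuming the format is simply "backslash followed by two hex digits, bytes interpreted as UTF-8", decodes the pairs directly. The class `BackslashHexDecoder` is a hypothetical helper, not part of commons-lang or the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; assumes "\XX" means the byte with hex value XX.
final class BackslashHexDecoder {

    static String decode(String input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 2 < input.length()
                    && isHex(input.charAt(i + 1)) && isHex(input.charAt(i + 2))) {
                // Escaped byte: parse the two hex digits after the backslash.
                out.write(Integer.parseInt(input.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                // Anything else (including '%', '+', or a malformed escape)
                // is copied through unchanged as UTF-8.
                int cp = input.codePointAt(i);
                byte[] bytes = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                out.write(bytes, 0, bytes.length);
                i += Character.charCount(cp);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    private static boolean isHex(char ch) {
        return (ch >= '0' && ch <= '9')
                || (ch >= 'a' && ch <= 'f')
                || (ch >= 'A' && ch <= 'F');
    }

    public static void main(String[] args) {
        // prints Character\&"<>'
        System.out.println(decode("Character\\5C&\\22\\3C\\3E'"));
    }
}
```

Collecting bytes before building the string means multi-byte UTF-8 sequences that are escaped byte by byte are reassembled correctly. Whether malformed escapes should be passed through, as here, or rejected is a design choice that depends on where the data comes from.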

Unescaping with StringEscapeUtils from Apache
