replaceAll not working for escaping characters in XML

Date: 2017-09-27 19:57:49

Tags: java json xml parsing

I'm trying to parse XML into JSON using Java. JSON.parse throws the following error on one of the characters:

JSON.parse: bad control character in string literal

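I tried replacing these characters before sending the string to JSON.parse, but this line of code doesn't work. Is there a better way to replace or remove these characters entirely?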

String trim = desc.replaceAll("\n", "\\n");

XML to parse:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod 
    tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim 
    veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea 
    commodo consequat. Duis aute irure dolor in reprehenderit in voluptate 
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint 
    occaecat cupidatat non proident, sunt in culpa qui officia deserunt 
    mollit anim id est laborum.

2 answers:

Answer 0 (score: 0)

If the sample you show is the complete XML input you have, then you are not parsing XML.

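Assuming it is only a fragment: your solution escapes just one thing, but to get valid JSON you should escape every character that is not allowed inside a JSON string, otherwise you will run into unwanted behavior. So it is a good idea to look for something that escapes JSON correctly for you: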

Java escape JSON String?
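To illustrate what "escaping everything JSON does not allow" involves, here is a minimal sketch of a string escaper. The class and method names (`JsonStrings`, `escape`) are hypothetical and not part of any library. It follows the JSON rule that a double quote, a backslash, and every control character below U+0020 must be escaped inside a string. In practice a JSON library would normally do this for you when it serializes a value, so treat this only as a picture of the rules.

```java
// Hypothetical helper, for illustration only.
final class JsonStrings {
    private JsonStrings() {}

    /** Escapes raw text so it can be placed between double quotes in a JSON document. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                case '\b': out.append("\\b");  break;
                case '\f': out.append("\\f");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the six-character hex escape form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

The escaped text would then be wrapped in quotes, for example `"\"" + JsonStrings.escape(desc) + "\""`, before being embedded in the JSON that is handed to JSON.parse.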

Answer 1 (score: 0)

Figured it out:

  public static String cleanDescription(String desc) {
        // Strip HTML elements (non-greedy match between '<' and '>').
        String trim = desc.replaceAll("<.*?>", "");

        // A phantom question mark sometimes gets added to the front and end of the string.
        // Drop a leading non-letter; the isEmpty() check avoids an exception on empty input.
        if (!trim.isEmpty() && !Character.isLetter(trim.charAt(0))) trim = trim.substring(1);

        // Count non-alphanumeric characters among the last three (fewer if the string is shorter).
        int charCount = 0;
        for (int j = 1; j <= 3 && j <= trim.length(); j++) {
            if (!Character.isLetterOrDigit(trim.charAt(trim.length() - j))) charCount++;
        }
        // If two or more were found, cut charCount - 1 characters off the end.
        if (charCount >= 2) trim = trim.substring(0, trim.length() - (charCount - 1));

        // Replace everything except letters, digits, parentheses, '.' and ',' with a space.
        // This also removes control characters such as line breaks.
        trim = trim.replaceAll("[^a-zA-Z0-9().,]", " ");

        return trim.trim();
    }
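Returning to the one-line replaceAll attempt in the question, one possible reason it had no useful effect is how Java treats the replacement argument. This is a sketch under the assumption that `desc` contains real line-feed characters. In `replaceAll`, a backslash in the replacement string escapes the next character, so the Java literal `"\\n"` is inserted as a plain `n`. The class and method names below (`NewlineEscapeDemo`, `escapeNewlines`) are made up for the demonstration.

```java
import java.util.regex.Matcher;

final class NewlineEscapeDemo {
    // Hypothetical helper: turns each real line feed into the two characters backslash + n.
    static String escapeNewlines(String desc) {
        // String.replace treats both arguments literally, so no regex rules apply.
        return desc.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String desc = "line one\nline two";

        // The backslash in the replacement is consumed, leaving only the letter n.
        System.out.println(desc.replaceAll("\n", "\\n"));

        // Both of these keep the backslash in the output.
        System.out.println(escapeNewlines(desc));
        System.out.println(desc.replaceAll("\n", Matcher.quoteReplacement("\\n")));
    }
}
```

`String.replace` is usually the simpler choice when neither argument needs regex features. `Matcher.quoteReplacement` helps when the search side has to stay a regular expression.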