JSON Jackson + HttpClient with German umlauts

Time: 2013-01-10 00:15:27

Tags: java android json httpclient

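I have a problem with a JSON string that contains German umlauts and that I fetch with the Apache HTTP client.

Mapping the JSON string only works if it does not contain any German umlauts; otherwise I get "JsonMappingException: Can not deserialize instance of [...] out of START_ARRAY".

The Apache HTTP client sends "Accept-Charset" set to HTTP.UTF_8, but in the result I always get e.g. "\u00fc" instead of "ü". When I manually replace e.g. "\u00fc" with "ü", the mapping works perfectly.

How can I get a UTF-8 encoded JSON response from the Apache HTTP client? Or is the server output the problem?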

params.setParameter(HttpProtocolParams.USE_EXPECT_CONTINUE, false);
HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
HttpProtocolParams.setContentCharset(params, HTTP.UTF_8);
httpclient = new DefaultHttpClient(params);
HttpGet httpGetContentLoad = new HttpGet(url);
httpGetContentLoad.setHeader("Accept-Charset", "utf-8");
httpGetContentLoad.setParams(params);
response = httpclient.execute(httpGetContentLoad);
entity = response.getEntity();
String loadedContent = null;
if (entity != null)
{
   loadedContent = EntityUtils.toString(entity, HTTP.UTF_8);
   entity.consumeContent();
}
if (HttpStatus.SC_OK != response.getStatusLine().getStatusCode())
{
    throw new Exception("Loading content failed");
}
closeConnection();
return loadedContent;

This is where the JSON code is mapped:

String jsonMetaData = loadGetRequestContent(getLatestEditionUrl(newspaperEdition));
Newspaper loadedNewspaper = mapper.readValue(jsonMetaData, Newspaper.class);
loadedNewspaper.setEdition(newspaperEdition);

Update 1: jsonMetaData is a String containing the fetched JSON code.

Update 2:

I use this code to convert the JSON output into the form I need:

public static String convertJsonLatestEditionMeta(String jsonCode)
{
    jsonCode = jsonCode.replaceFirst("\\[\"[A-Za-z0-9-[:blank:]]+\",\\{", "{\"edition\":\"an-a1\",");
    jsonCode = jsonCode.replaceFirst("\"pages\":\\{", "\"pages\":\\[");
    jsonCode = Helper.replaceLast(jsonCode, "}}}]", "}]}");
    jsonCode = jsonCode.replaceAll("\"[\\d]*\"\\:\\{\"", "\\{\"");
    return jsonCode;
}
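Helper.replaceLast is not shown in the question. A minimal sketch of what such a helper might look like, assuming it replaces only the last literal occurrence of the target string (the body below is a guess, not the original implementation):

```java
public class Helper {
    /**
     * Replaces the last literal occurrence of target in text.
     * Assumed behaviour; the question does not show the real helper.
     */
    public static String replaceLast(String text, String target, String replacement) {
        int idx = text.lastIndexOf(target);
        if (idx < 0) {
            return text; // nothing to replace
        }
        return text.substring(0, idx) + replacement + text.substring(idx + target.length());
    }
}
```

Unlike replaceFirst and replaceAll, this variant treats the target as plain text rather than a regular expression, so the braces and brackets in "}}}]" need no escaping.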

Update 3: JSON conversion example.

JSON code before the conversion:

["Newspaper title",
    {
        "date":"20130103",
        "pages":
        {
            "1":{"ressort":"ressorttitle1","pdfpfad":"pathToPdf1","number":1,"size":281506},
            "2":{"ressort":"ressorttitle2","pdfpfad":"pathToPdf2","number":2,"size":281533},
            [...]
        }
    }
]

JSON code after the conversion:

{
    "edition":"Newspaper title",
    "date":"20130103",
    "pages":
    [
        {"ressort":"ressorttitle1","pdfpfad":"pathToPdf1","number":1,"size":281506},
        {"ressort":"ressorttitle2","pdfpfad":"pathToPdf2","number":2,"size":281533},
        [...]
    ]
}

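Solution: I started using GSON as @Boris suggested, and the problem with the umlauts is gone! Furthermore, GSON seems to be faster than Jackson.

A workaround would be to replace the characters manually according to this table: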

Sign        Unicode representation

Ä, ä        \u00c4, \u00e4
Ö, ö        \u00d6, \u00f6
Ü, ü        \u00dc, \u00fc
ß           \u00df
€           \u20ac
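Instead of hard-coding each row of the table, the replacement can be generalised with a regular expression. The following is only a sketch; the class and method names are made up for illustration. It decodes every "\uXXXX" sequence it finds and deliberately ignores edge cases: an escape that follows an escaped backslash, or one that decodes to a quote or backslash, would corrupt the JSON. A conforming JSON parser decodes these escapes on its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescaper {
    // A literal backslash, the letter u, then exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    /** Replaces each \\uXXXX sequence in the input with the character it denotes. */
    public static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and '\' in the decoded text from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, UnicodeUnescaper.unescape applied to the text M\u00fcnchen (with a literal backslash) yields München.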

1 answer:

Answer 0: (score: 2)

Try parsing like this:

entity = response.getEntity();
Newspaper loadedNewspaper=mapper.readValue(entity.getContent(), Newspaper.class);

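There is no reason to go through a String; Jackson parses an InputStream directly. Jackson will also auto-detect the encoding if you use the approach I proposed.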
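As a rough illustration of what encoding auto-detection on a byte stream can look like, here is a sketch of the heuristic described in RFC 4627, section 3, which inspects the pattern of zero bytes in the first four octets. This is not Jackson's actual code, and the class and method names are invented for the example.

```java
public class JsonEncodingSniffer {
    /**
     * Guesses the encoding of a JSON text from its first four bytes.
     * A JSON text starts with two ASCII characters, so the position of the
     * zero bytes reveals the code unit width and byte order.
     */
    public static String detect(byte[] head) {
        if (head.length < 4) {
            return "UTF-8"; // too short to inspect; fall back to the JSON default
        }
        boolean z0 = head[0] == 0;
        boolean z1 = head[1] == 0;
        boolean z2 = head[2] == 0;
        boolean z3 = head[3] == 0;
        if (z0 && z1 && z2 && !z3) return "UTF-32BE";  // 00 00 00 xx
        if (z0 && !z1 && z2 && !z3) return "UTF-16BE"; // 00 xx 00 xx
        if (!z0 && z1 && z2 && z3) return "UTF-32LE";  // xx 00 00 00
        if (!z0 && z1 && !z2 && z3) return "UTF-16LE"; // xx 00 xx 00
        return "UTF-8";                                // xx xx xx xx
    }
}
```

Detection of this kind only works on the raw bytes, which is one reason to hand the parser the InputStream rather than a String that has already been decoded.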

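Edit: By the way, consider using the GSON JSON parsing library. It is even faster than Jackson and easier to use. However, Jackson has recently started parsing XML as well, which is a virtue.

Edit 2: After all the details you have added, I would assume the problem is in the server-side implementation of the service: the umlauts do not need to be Unicode-escaped in the JSON, since UTF-8 encodes them natively. Instead of manually replacing e.g. "\u00fc" with "ü", why not do it with a regular expression?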
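To illustrate why charset settings on the client cannot turn "\u00fc" into "ü": the escaped form consists of six ASCII characters, so it produces the same bytes in UTF-8 and ISO-8859-1, whereas the literal character does not. A small sketch (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EscapedUmlautBytes {
    /** Returns true if the text encodes to identical bytes in UTF-8 and ISO-8859-1. */
    public static boolean sameBytesInBothCharsets(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        String escaped = "\\u00fc"; // six ASCII characters: backslash, u, 0, 0, f, c
        String literal = "\u00fc";  // the single character ü
        System.out.println(sameBytesInBothCharsets(escaped)); // prints true
        System.out.println(sameBytesInBothCharsets(literal)); // prints false
    }
}
```

Whatever charset the response is decoded with, the escape sequence arrives unchanged; only a JSON parser (or a manual replacement) turns it back into the umlaut.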