Is there a way to convert non-ASCII characters to Unicode and leave ASCII as is?

Asked: 2016-01-07 00:15:48

Tags: java unicode encoding uri apache-httpclient-4.x

I just discovered that Apache HttpClient returns an incorrectly decoded Location header if it contains percent-encoded letters, while the same request in a browser returns the correct string.

I wrote a method to restore the URI. Did I write it correctly? Is there a simpler way?

import java.net.URLDecoder;

public class Test {
    public static void main(String[] args) throws Exception {
        String uri = "/search-zero?searchterm=\u00D1\u008C";
        String converted = convert(uri);
        System.out.println(converted); // /search-zero?searchterm=%D1%8C
        System.out.println(URLDecoder.decode(converted, "utf-8")); // /search-zero?searchterm=ь
    }

    // Percent-encodes every non-ASCII char and leaves ASCII chars untouched.
    // Assumes each non-ASCII char holds a single raw byte (value <= 0xFF).
    private static String convert(String uri) {
        StringBuilder result = new StringBuilder();
        for (char c : uri.toCharArray()) {
            if (c > 127) {
                result.append('%');
                result.append(String.format("%02X", (int) c));
            } else {
                result.append(c);
            }
        }
        return result.toString();
    }
}

Update

My current HttpClient configuration:

@Bean
public CloseableHttpClient getHttpClient() {
    ConnectionConfig connectionConfig = ConnectionConfig.custom().setCharset(Consts.UTF_8).build();

    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setMaxTotal(200);
    cm.setDefaultMaxPerRoute(20);

    return HttpClients.custom()
            .setDefaultConnectionConfig(connectionConfig)
            .setConnectionManager(cm)
            .setRedirectStrategy(new CustomRedirectStrategy())
            .build();
}

public class CustomRedirectStrategy extends DefaultRedirectStrategy {

    @Override
    public URI getLocationURI(HttpRequest request, HttpResponse response, HttpContext context) throws ProtocolException {
        System.out.println(response.getFirstHeader("location"));
        URI uri = super.getLocationURI(request, response, context);
        return uri;
    }
}

Working code (the custom connection manager either has to be configured correctly or simply removed). Thanks, Oleg!

    @Bean
    public CloseableHttpClient getHttpClient() {
        ConnectionConfig connectionConfig = ConnectionConfig.custom().setCharset(Consts.UTF_8).build();

//        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
//        cm.setMaxTotal(200);
//        cm.setDefaultMaxPerRoute(20);

        return HttpClients.custom()
                .setDefaultConnectionConfig(connectionConfig)
//                .setConnectionManager(cm)
                .setRedirectStrategy(new CustomRedirectStrategy())
                .build();
    }
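
For anyone who wants to keep the pooling connection manager instead of removing it, here is a minimal sketch of what "configured correctly" could look like. It assumes HttpClient 4.3 or later, where a `ConnectionConfig` passed to the builder does not take effect once an explicit connection manager is supplied, so the config is set on the manager itself. The class name `HttpClientFactory` is hypothetical, and the redirect strategy from the question is left out to keep the fragment self-contained.

```java
import org.apache.http.Consts;
import org.apache.http.config.ConnectionConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

// Hypothetical factory class, used here only for illustration.
public class HttpClientFactory {

    public static CloseableHttpClient create() {
        // Decode protocol elements (status line, headers) as UTF-8.
        ConnectionConfig connectionConfig = ConnectionConfig.custom()
                .setCharset(Consts.UTF_8)
                .build();

        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(200);
        cm.setDefaultMaxPerRoute(20);
        // Assumption: with an explicit manager, the charset must be set here
        // rather than via HttpClientBuilder.setDefaultConnectionConfig.
        cm.setDefaultConnectionConfig(connectionConfig);

        return HttpClients.custom()
                .setConnectionManager(cm)
                .build();
    }
}
```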

1 answer:

Answer 0 (score: 1)

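HttpClient can be forced to use a non-standard charset for protocol elements, which should improve interoperability with broken web servers that include unescaped non-ASCII characters in the 'Location' header: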

ConnectionConfig connectionConfig = ConnectionConfig.custom()
        .setCharset(Consts.ISO_8859_1)
        .build();
CloseableHttpClient client = HttpClients.custom()
        .setDefaultConnectionConfig(connectionConfig)
        .build();
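
If the client configuration cannot be changed, an already mis-decoded header value can probably be repaired with the standard library alone, which is simpler than the hand-written `convert` method in the question. The sketch below assumes the server sent raw UTF-8 bytes and that each byte ended up as one `char` (so every char is at most U+00FF); if that does not hold, the repair will produce garbage. The class and method names (`LocationRepair`, `repair`, `percentEncodeNonAscii`) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

class LocationRepair {

    /** Re-reads a string whose chars are really raw UTF-8 bytes, one byte per char. */
    static String repair(String misdecoded) {
        // ISO-8859-1 maps chars U+0000..U+00FF back to the identical byte values.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Percent-encodes every non-ASCII character as UTF-8 and leaves ASCII untouched. */
    static String percentEncodeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.append((char) cp);
            } else {
                byte[] utf8 = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String header = "/search-zero?searchterm=\u00D1\u008C";
        String repaired = repair(header);
        System.out.println(repaired);                        // /search-zero?searchterm=ь
        System.out.println(percentEncodeNonAscii(repaired)); // /search-zero?searchterm=%D1%8C
    }
}
```

Unlike `convert`, `percentEncodeNonAscii` also handles characters above U+00FF, because it encodes the UTF-8 bytes of each code point rather than the char value itself.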