Java: HttpURLConnection with Cyrillic characters

Asked: 2015-12-31 11:33:27

Tags: java web-services encoding

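I need to make a SOAP request to a web service. There are only 2 functions, so I decided to use a plain HttpURLConnection to speed up development. Here is the test code (don't be scared by the try/catch blocks, it is only a test).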

HttpURLConnection connection = null;
    URL url = null;
    try {
        url = new URL("http://doc.ssau.ru/ssau_biblioteka_test/ws/DspaceIntegration.1cws");
    } catch (MalformedURLException e) {
        e.printStackTrace();
    }

    try {
        connection = (HttpURLConnection)url.openConnection();
    } catch (IOException e) {
        e.printStackTrace();
    }
    connection.setRequestProperty("Authorization", "Basic d2Vic2VydmljZTp3ZWJzZXJ2aWNl");
    connection.setRequestProperty("Content-Type", "application/json;charset=UTF-8");
    connection.setRequestProperty("Accept-Encoding", "gzip,deflate");
    try {
        connection.setRequestMethod("POST");
    } catch (ProtocolException e) {
        e.printStackTrace();
    }
    connection.setDoOutput(true);

    DataOutputStream wr = null;
    try {
        wr = new DataOutputStream(
                connection.getOutputStream());
    } catch (IOException e) {
        e.printStackTrace();
    }

    String myString = "RU/НТБ СГАУ/WALL/Х62/С 232-948516";
    byte bytes[] = new byte[0];
    try {
        bytes = myString.getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    String value = "";
    try {
         value = new String(bytes, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    try {
        wr.writeBytes("<soap:Envelope xmlns:soap=\"http://www.w3.org/2003/05/soap-envelope\" xmlns:imc=\"http://imc.parus-s.ru\">\n" +
                "   <soap:Header/>\n" +
                "   <soap:Body>\n" +
                "      <imc:GetRecordsInfo>\n" +
                "         <imc:Codes>RU/НТБ СГАУ/WALL/Х62/С 232-948516</imc:Codes>\n" +
                "         <imc:Separator>?</imc:Separator>\n" +
                "         <imc:Type>?</imc:Type>\n" +
                "      </imc:GetRecordsInfo>\n" +
                "   </soap:Body>\n" +
                "</soap:Envelope>");
    } catch (IOException e) {
        e.printStackTrace();
    }
    try {
        wr.close();
    } catch (IOException e) {
        e.printStackTrace();
    }

    InputStream is = null;
    try {
        is = connection.getInputStream();
    } catch (IOException e) {
        e.printStackTrace();
    }
    BufferedReader rd = new BufferedReader(new InputStreamReader(is));
    StringBuilder response = new StringBuilder(); // or StringBuffer if not Java 5+
    String line;
    try {
        while((line = rd.readLine()) != null) {
            System.out.println(line);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    try {
        rd.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

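So here is the problem: when I call writeBytes with Cyrillic characters, for example "RU/НТБ СГАУ/WALL/Х62/С 232-948516", the web service returns 500. If I use only Latin characters or leave the value empty, everything works. What is the correct way to encode Cyrillic?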

Update: the problem is solved. I used this construct:

wr.write(new String("<soap:Envelope xmlns:soap=\"http://www.w3.org/2003/05/soap-envelope\" xmlns:imc=\"http://imc.parus-s.ru\">\n" +
                "   <soap:Header/>\n" +
                "   <soap:Body>\n" +
                "      <imc:GetRecordsInfo>\n" +
                "         <imc:Codes>RU/НТБ СГАУ/WALL/Х62/С 232-948516</imc:Codes>\n" +
                "         <imc:Separator>?</imc:Separator>\n" +
                "         <imc:Type>?</imc:Type>\n" +
                "      </imc:GetRecordsInfo>\n" +
                "   </soap:Body>\n" +
                "</soap:Envelope>").getBytes(charset));

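Why the update works can be shown without a network call. According to its Javadoc, DataOutputStream.writeBytes writes each character by discarding its high eight bits, so any character above U+00FF is damaged before it reaches the server, while write(byte[]) sends exactly the bytes produced by getBytes. The sketch below is only an illustration: the class and method names are made up for it, and it assumes the `charset` variable in the update (whose declaration is not shown) is UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class WriteBytesDemo {
    // Bytes that DataOutputStream.writeBytes produces for s.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s); // keeps only the low 8 bits of each char
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Bytes produced by encoding s as UTF-8 and writing the byte array.
    static byte[] viaUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.write(s.getBytes(StandardCharsets.UTF_8)); // assumed charset
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        String codes = "RU/НТБ СГАУ/WALL/Х62/С 232-948516";
        String damaged = new String(viaWriteBytes(codes), StandardCharsets.UTF_8);
        String intact = new String(viaUtf8(codes), StandardCharsets.UTF_8);
        System.out.println("writeBytes round trip intact: " + damaged.equals(codes));
        System.out.println("UTF-8 round trip intact:      " + intact.equals(codes));
    }
}
```

With writeBytes, every char becomes a single byte, so the Cyrillic letters arrive as unrelated bytes that do not form the original text. That is consistent with the service accepting Latin-only input and failing on Cyrillic.
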
1 answer:

Answer 0: (score: 0)

I'm not entirely sure what is going wrong, but there are a few things you could try.

1) Try a different encoding. The following code may help.

Charset charset = Charset.forName("UTF-16");    
byte[] encodedBytes = myString.getBytes(charset);
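
When trying different encodings it helps to check whether a candidate can represent the string at all. The following is a rough sketch with a made-up class and helper name; it limits itself to charsets from StandardCharsets. getBytes silently substitutes a replacement byte for characters the charset cannot encode, so a failed round trip is the signal to look for. Whichever charset is chosen presumably also has to match the one declared in the request's Content-Type header.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for comparing candidate encodings.
class EncodingComparison {
    // True if encoding and then decoding with the same charset returns the input.
    static boolean roundTrips(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset).equals(text);
    }

    public static void main(String[] args) {
        String myString = "RU/НТБ СГАУ/WALL/Х62/С 232-948516";
        Charset[] candidates = {
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16,     // getBytes prepends a byte-order mark
            StandardCharsets.ISO_8859_1  // has no Cyrillic letters
        };
        for (Charset cs : candidates) {
            System.out.printf("%-12s bytes=%d roundTrips=%b%n",
                    cs.name(), myString.getBytes(cs).length, roundTrips(myString, cs));
        }
    }
}
```
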

2) Another thing you can try is guessing the charset of your string. (Note: the charset cannot be guessed correctly 100% of the time.)

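// CharsetDetector and CharsetMatch come from the ICU4J library:
// import com.ibm.icu.text.CharsetDetector;
// import com.ibm.icu.text.CharsetMatch;
byte[] thisAppCanBreak = "this app can break".getBytes("ISO-8859-1");
CharsetDetector detector = new CharsetDetector();
detector.setText(thisAppCanBreak);
String tableTemplate = "%10s %10s %8s%n";
System.out.format(tableTemplate, "CONFIDENCE", "CHARSET", "LANGUAGE");
// detectAll() returns the candidate charsets, most likely first
for (CharsetMatch match : detector.detectAll()) {
    System.out.format(tableTemplate, match.getConfidence(),
            match.getName(), match.getLanguage());
}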

This link may also help.