I have a Russian string that I have encoded as UTF-8:
String str = "\u041E\u041A";
System.out.println("String str : " + str);
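When I print the string in the Eclipse console, I get ??
Can anyone suggest how to print Russian strings to the console, or tell me what I am doing wrong here?
I tried converting it to bytes with byte myArr[] = str.getBytes("UTF-8")
and then back with new String(myArr, "UTF-8"),
but I still get the same problem :-(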
Answer 0 (score: 6)
Try this:
String myString = "some cyrillic text";
byte bytes[] = myString.getBytes("ISO-8859-1");
String value = new String(bytes, "UTF-8");
Or this:
String myString = "some cyrillic text";
byte bytes[] = myString.getBytes("UTF-8");
String value = new String(bytes, "UTF-8");
The main issue with Russian text is setting the UTF-8 encoding correctly.
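To make the point about setting the encoding concrete, here is a minimal sketch. The class name Utf8ConsoleDemo and the helper encodeForConsole are made up for illustration. It shows that a getBytes("UTF-8") / new String(..., "UTF-8") round trip returns an equal string, so it cannot change what the console shows, and that wrapping System.out in a PrintStream with an explicit charset controls which bytes are written. It assumes the console itself is configured to decode UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class Utf8ConsoleDemo {

    // Hypothetical helper: returns the bytes a UTF-8 PrintStream emits for the given text.
    static byte[] encodeForConsole(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buffer, true, "UTF-8");
            out.print(text);
            out.flush();
            return buffer.toByteArray();
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String str = "\u041E\u041A";

        // Encoding and decoding with the same charset gives back an equal string.
        String roundTrip = new String(str.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        System.out.println("Round trip equal: " + roundTrip.equals(str));

        // Each of the two Cyrillic letters takes two bytes in UTF-8.
        System.out.println("UTF-8 byte count: " + encodeForConsole(str).length);

        // Write through a stream with an explicit charset instead of the platform default.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("String str : " + str);
    }
}
```

If the last line still shows ?? or garbage, the bytes are most likely fine and the console is decoding them with a different charset.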
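Answer 1 (score: 2)
In Eclipse, go to Run > Run Configurations > Common and change the console encoding to UTF-8. You will then be able to see Russian characters in the console.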
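To check what the launched program actually sees after changing that setting, a small diagnostic like the one below may help. The class name EncodingCheck and the helper describeEncoding are hypothetical. Whether the Eclipse console encoding is reflected in these values depends on the Eclipse and JDK versions, so treat that link as an assumption to verify rather than a guarantee.

```java
import java.nio.charset.Charset;

public class EncodingCheck {

    // Hypothetical helper: reports the JVM default charset and the file.encoding property.
    static String describeEncoding() {
        return "defaultCharset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // Run this with the same run configuration as the program that prints ??
        System.out.println(describeEncoding());
    }
}
```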
Answer 2 (score: 1)
One question: is your console able to display Russian characters?
Answer 3 (score: 0)
The console's display font most likely cannot handle non-ASCII characters.
You could try printing to a file instead of System.out
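As a rough sketch of the file-based check suggested here, the snippet below writes the string to a file with an explicit UTF-8 charset. The class name WriteToFileDemo, the helper writeUtf8, and the file name output.txt are placeholders chosen for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WriteToFileDemo {

    // Hypothetical helper: writes the text to the target file as UTF-8, replacing any old content.
    static void writeUtf8(Path target, String text) {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path target = Paths.get("output.txt"); // placeholder file name
        writeUtf8(target, "String str : \u041E\u041A");
        System.out.println("Wrote " + target.toAbsolutePath());
    }
}
```

Opening the file in an editor set to UTF-8 shows whether the string itself is intact. If it is, the problem lies in the console's encoding or font, not in the Java code.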
Answer 4 (score: 0)
My Eclipse prints it correctly:
String str : ОК
Try changing the run configuration encoding to UTF-8 or CP1251
Answer 5 (score: 0)
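This is an old topic, but the following may still help.
If you are working with an InputStream / InputStreamReader that carries Cyrillic characters
(for example, reading data from some API) and you see garbage such as ������ ���
or ?????? ???,
try passing the encoding as a Charset in the second argument of the
InputStreamReader constructor.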
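Example:
Let's use the Central Bank of Russia API to get the Russian ruble price of the US dollar and the euro. The code below fetches the data for the current day at the time of the request. The API returns XML,
so we also need to parse it.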
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.Charset;
public class CBRFApi {

    public static void main(String[] args) throws UnsupportedEncodingException {
        String output = getAndReadData("http://www.cbr.ru/scripts/XML_daily.asp");
        Document document = loadXMLFromString(output);
        // getting root element
        Node root = document.getDocumentElement();
        NodeList currencies = root.getChildNodes();
        // just for further reference
        Node usDollar;
        Node euro;
        for (int i = 0; i < currencies.getLength(); i++) {
            Node currency = currencies.item(i);
            String key = currency.getAttributes().getNamedItem("ID").getNodeValue();
            if (key.equals("R01235") || key.equals("R01239")) {
                if (key.equals("R01235")) // US dollar ID
                    usDollar = currency;
                else if (key.equals("R01239")) // Euro ID
                    euro = currency;
                NodeList currencySpecs = currency.getChildNodes();
                System.out.print(currencySpecs.item(1).getTextContent());
                System.out.print(" " + currencySpecs.item(3).getTextContent());
                System.out.print(" " + currencySpecs.item(4).getTextContent());
                System.out.println();
            }
        }
    }

    public static String getAndReadData(String link) {
        String output = "";
        try {
            URL url = new URL(link);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/xml");
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Linux; Android 4.2.2; en-us; SAMSUNG GT-I9505 Build/JDQ39) " +
                    "AppleWebKit/535.19 (KHTML, like Gecko) Version/1.0 Chrome/18.0.1025.308 Mobile Safari/535.19.");
            if (conn.getResponseCode() != 200) {
                throw new RuntimeException("Failed : HTTP error code : "
                        + conn.getResponseCode());
            }
            // below is the key line,
            // without second parameter - Charset.forName("CP1251") -
            // data in Cyrillic will be returned like ������ ���
            InputStreamReader inputStreamReader = new InputStreamReader(conn.getInputStream(), Charset.forName("CP1251"));
            BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                output += line;
            }
            conn.disconnect();
            return output;
        } catch (MalformedURLException e) {
            e.printStackTrace();
            return null;
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

    public static Document loadXMLFromString(String xml) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = null;
        try {
            builder = factory.newDocumentBuilder();
            InputSource inputSource = new InputSource(new StringReader(xml));
            return builder.parse(inputSource);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
            return null;
        }
    }
}
So the correct output is:
USD Доллар США 63,3791
EUR Евро 70,5980
And without specifying Charset.forName("CP1251"):
USD ������ ��� 63,3791
EUR ���� 70,5980
Of course, the actual encoding in your case may differ from CP1251,
so if this encoding does not work, try another one.
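One way to try several encodings without hitting the network each time is sketched below. The class name CharsetProbe, the helper readFirstLine, and the simulated response bytes are all made up for illustration, and the candidate list is only an assumption about which encodings are worth testing for Russian text.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class CharsetProbe {

    // Hypothetical helper: reads the first line of the stream, decoding it with the given charset.
    static String readFirstLine(InputStream in, Charset charset) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line = reader.readLine();
            return line == null ? "" : line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulated response body: the word "Евро" encoded as windows-1251 (CP1251).
        byte[] body = "\u0415\u0432\u0440\u043E".getBytes(Charset.forName("windows-1251"));

        // Candidate encodings to try; this list is an assumption, adjust it to your data source.
        String[] candidates = {"UTF-8", "windows-1251", "KOI8-R"};
        for (String name : candidates) {
            if (!Charset.isSupported(name)) {
                continue; // skip charsets this JVM does not provide
            }
            String decoded = readFirstLine(new ByteArrayInputStream(body), Charset.forName(name));
            System.out.println(name + " -> " + decoded);
        }
    }
}
```

The candidate that prints readable Cyrillic is the one to pass to the InputStreamReader constructor. When the server declares its encoding in a Content-Type header or in the XML declaration, that declared value is a better starting point than guessing.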
Answer 6 (score: 0)
I ran into the same problem when reading a file "MyFile.txt" that contains Russian letters. This may help someone. The solution is:
package j;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class J4 {

    public static void Read_TXT_File(String fileName) throws
            FileNotFoundException {
        int i = 0;
        // try-with-resources closes the scanner and the underlying file when done
        try (Scanner scanner = new Scanner(new File(fileName), "utf-8")) {
            // hasNextLine() pairs with nextLine(); hasNext() would stop early on trailing blank lines
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                //byte bytes[] = line.getBytes("UTF-8");
                //line = new String(bytes, "UTF-8");
                if (line.isEmpty()) {
                    System.out.println(i + ": Empty line");
                }
                else {
                    System.out.println(i + ": " + line);
                    // here is your code for example String MyString = line
                }
                i++;
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    public static void main(String[] args) throws
            FileNotFoundException {
        Read_TXT_File("MyFile.txt");
    }
}
Answer 7 (score: -1)