I'm trying to read a UTF-8 file and convert it to CP850 (to send it to a printer device). My test string is "ATIVAÇÃO":
A T I V A Ç Ã O
0x41 0x54 0x49 0x56 0x41 0xC3 0x87 0xC3 0x83 0x4F
My Java code:
private static void printBytes(String s, String st) {
    byte[] b_str = s.getBytes();
    System.out.print(String.format("%-7s >>> ", st));
    for (int i = 0; i < s.length(); i++)
        System.out.print(String.format("%-7s ", s.charAt(i)));
    System.out.println();
    System.out.print(String.format("%-7s >>> ", st));
    for (int i = 0; i < b_str.length; i++)
        System.out.print(String.format("0x%-5x ", (int) b_str[i] & 0xff));
    System.out.println();
}

public static void main(String[] args) throws Exception {
    String F = "file.txt";
    InputStreamReader input = new InputStreamReader(new FileInputStream(F));
    BufferedReader in = new BufferedReader(input);
    String strFILE;
    String strCP850;
    while ((strFILE = in.readLine()) != null) {
        strFILE = strFILE.substring(3);
        printBytes(strFILE, "ORI");
        strCP850 = new String(strFILE.getBytes(), "CP850");
        printBytes(strCP850, "CP850");
        System.exit(0);
    }
    in.close();
}
Output:
ORI >>> A T I V A Ã ‡ Ã ƒ O
ORI >>> 0x41 0x54 0x49 0x56 0x41 0xc3 0x87 0xc3 0x83 0x4f
CP850 >>> A T I V A ? ç ? â O
CP850 >>> 0x41 0x54 0x49 0x56 0x41 0x3f 0xe7 0x3f 0xe2 0x4f
I was expecting "Ç" to be 0xc7 and "Ã" to be 0xc3, but the conversion produces two-byte characters (as in UTF-8 ...).
What am I doing wrong?
Is there a way to do this (JDK 1.6)?
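Answer 0 (score: 1)
First: a String has no encoding. What you do need to get right is specifying the encoding when you read the file as text.
To read a file as UTF-8 and then dump it as cp850, you can do this (it uses java.nio.file and try-with-resources, so it requires Java 7 or later):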
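To make that point concrete, here is a small sketch that encodes one and the same String with two different charsets. The class name EncodingDemo and the helper hex are made up for illustration, and the sketch assumes the JRE ships the extended charsets (which is where Cp850 lives). The characters are written as Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.Charset;

public class EncodingDemo {
    // Hypothetical helper: encode text with the named charset and
    // render the resulting bytes as space-separated hex values.
    public static String hex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00C7 is Ç and \u00C3 is Ã.
        String text = "\u00C7\u00C3";
        // prints "UTF-8: 0xC3 0x87 0xC3 0x83"
        System.out.println("UTF-8: " + hex(text, "UTF-8"));
        // prints "Cp850: 0x80 0xC7"
        System.out.println("Cp850: " + hex(text, "Cp850"));
    }
}
```

Note that in CP850, Ç is 0x80 and Ã is 0xC7. The values 0xC7 and 0xC3 expected in the question are the ISO-8859-1 codes for those characters, not the CP850 ones.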
final Path path = Paths.get("file.txt");

try (
    final BufferedReader reader = Files.newBufferedReader(path,
        StandardCharsets.UTF_8);
) {
    String line;
    byte[] bytes;

    while ((line = reader.readLine()) != null) {
        bytes = line.getBytes(Charset.forName("cp850"));
        // write this method
        dumpBytes(bytes);
    }
}
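Since the question asks about JDK 1.6, the same idea can probably be expressed without the Java 7 APIs. The following is only a sketch under a few assumptions: the class name Utf8ToCp850 and the method convert are invented for illustration, dumpBytes is a hypothetical hex-dump stand-in for the method left unwritten above, and the BOM handling assumes that the substring(3) in the question was there to skip a UTF-8 byte order mark (once the file is decoded as UTF-8, that mark becomes the single character U+FEFF). The main method feeds sample bytes from memory rather than reading file.txt.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class Utf8ToCp850 {
    // Decode the stream as UTF-8, line by line, and return each line
    // re-encoded as CP850. Uses only APIs available in Java 6.
    public static List<byte[]> convert(InputStream utf8Input) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(utf8Input, "UTF-8"));
        List<byte[]> lines = new ArrayList<byte[]>();
        try {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                // Assumption: the input may start with a byte order mark,
                // which Java's UTF-8 decoder does not strip on its own.
                if (first && line.length() > 0 && line.charAt(0) == '\uFEFF') {
                    line = line.substring(1);
                }
                first = false;
                lines.add(line.getBytes("Cp850"));
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    // Hypothetical stand-in for dumpBytes: format the bytes as hex.
    public static String dumpBytes(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // BOM followed by "ATIVAÇÃO" encoded as UTF-8.
        byte[] sample = {
            (byte) 0xEF, (byte) 0xBB, (byte) 0xBF,
            0x41, 0x54, 0x49, 0x56, 0x41,
            (byte) 0xC3, (byte) 0x87, (byte) 0xC3, (byte) 0x83, 0x4F
        };
        for (byte[] line : convert(new ByteArrayInputStream(sample))) {
            // prints "0x41 0x54 0x49 0x56 0x41 0x80 0xC7 0x4F"
            System.out.println(dumpBytes(line));
        }
    }
}
```

To read the real file instead, pass new FileInputStream("file.txt") to convert. The important part is that the charset is named explicitly both when decoding the input and when encoding the output, so nothing depends on the platform default encoding.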