How to convert Arabic chars to a hex string in Java

Asked: 2012-08-04 18:12:45

Tags: java string character-encoding char arabic

When str holds an Arabic string, the following code returns ?????? as output:

String str="مرحبا",str2="";
for (int i = 0; i < str.length(); ++i) {
                str2 += displayChar(str.charAt(reorder[i]));
                System.out.print(reorder[i]);
            }
   System.out.println(str2); // output is : ?????

and:

String displayChar(char c) {
        if (c < '\u0010') {
            return "0x0" + Integer.toHexString(c);
        } else if (c < '\u0020' || c >= '\u007f') {
            return "0x" + Integer.toHexString(c);
        } else {
            return c+"";
        }
    }

Note that the reorder integer array just contains the new indexes (the ordering) of the characters in the given str.
Here is the complete code; I hope it helps in understanding the problem:
/*
 * (C) Copyright IBM Corp. 1999, All Rights Reserved
 *
 * version 1.0
 */

import java.io.*;

/**
 * A simple command-line interface to the BidiReference class.
 * <p>
 * This prompts the user for an ASCII string, runs the reference
 * algorithm on the string, and displays the results to the terminal.
 * An empty return to the prompt exits the program.
 * <p>
 * ASCII characters are preassigned various bidi direction types. 
 * These types can be displayed by the user for reference by
 * typing <code>-display</code> at the prompt.  More help can be
 * obtained by typing <code>-help</code> at the prompt.
 */
public class BidiReferenceTest {
    BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
    PrintWriter writer = new PrintWriter(new BufferedOutputStream(System.out));
    BidiReferenceTestCharmap charmap = BidiReferenceTestCharmap.TEST_ARABIC;
    byte baseDirection = -1;

    /**
     * Run the interactive test.
     */
    public static void main(String args[]) {
        new BidiReferenceTest().run();
    }

    void run() {
        //printHelp();

        while (true) {
            writer.print("> ");
            writer.flush();
            String input;
            try {
                input = reader.readLine();
            }
            catch (Exception e) {
                writer.println(e);
                continue;
            }

            if (input.length() == 0) {
                writer.println("Bye!");
                writer.flush();
                return;
            }

            if (input.charAt(0) == '-') { // command
                int limit = input.indexOf(' ');
                if (limit == -1) {
                    limit = input.length();
                }
                String cmd = input.substring(0, limit);
                if (cmd.equals("-display")) {
                    charmap.dumpInfo(writer);
                } else if (cmd.equals("-english")) {
                    charmap = BidiReferenceTestCharmap.TEST_ENGLISH;
                    charmap.dumpInfo(writer);
                } else if (cmd.equals("-hebrew")) {
                    charmap = BidiReferenceTestCharmap.TEST_HEBREW;
                    charmap.dumpInfo(writer);
                } else if (cmd.equals("-arabic")) {
                    charmap = BidiReferenceTestCharmap.TEST_ARABIC;
                    charmap.dumpInfo(writer);
                } else if (cmd.equals("-mixed")) {
                    charmap = BidiReferenceTestCharmap.TEST_MIXED;
                    charmap.dumpInfo(writer);
                } else if (cmd.equals("-baseLTR")) {
                    baseDirection = 0;
                } else if (cmd.equals("-baseRTL")) {
                    baseDirection = 1;
                } else if (cmd.equals("-baseDefault")) {
                    baseDirection = -1;
                } else {
                }
            } else {

                String ss= runSample(input);
                System.out.println(ss);
                Character.UnicodeBlock block =  Character.UnicodeBlock.of(Character.codePointAt(ss, 0));

            }
        }
    }



    String runSample(String str) {
        String str2 = "";
        try {
            charmap = BidiReferenceTestCharmap.TEST_ARABIC;

            byte[] codes = charmap.getCodes(str);
            baseDirection = 1;
            BidiReference bidi = new BidiReference(codes, baseDirection); // baseDirection = 1
            int[] reorder = bidi.getReordering(new int[] { codes.length });
            /*
            writer.println("base level: " + bidi.getBaseLevel() + (baseDirection != -1 ? " (forced)" : ""));

            // output original text
            for (int i = 0; i < str.length(); ++i) {
                displayChar(str.charAt(i));
            }
            writer.println();
             */
            // output visually ordered text
            for (int i = 0; i < str.length(); ++i) {
                str2 += displayChar(str.charAt(reorder[i]));
                System.out.print(reorder[i]);
            }
            return str2;
        }
        catch (Exception e) {
            return "";
        }
    }

    String displayChar(char c) {
        if (c < '\u0010') {
            return "0x0" + Integer.toHexString(c);
        } else if (c < '\u0020' || c >= '\u007f') {
            return "0x" + Integer.toHexString(c);
        } else {
            return c+"";
        }
    }
}

2 Answers:

Answer 0 (score: 0)

One problem is that your terminal may not properly support Unicode characters (this may not be the only problem).
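One way to check whether the terminal is at fault is to print the string's UTF-16 values in hex, which uses only ASCII and so displays correctly anywhere. The following is a minimal sketch, not the asker's code: the class name HexDump and the helper toHex are hypothetical, and the Arabic literal is written with Unicode escapes so that the source-file encoding cannot interfere. The second print wraps System.out in a UTF-8 PrintStream; whether that text renders correctly still depends on the terminal's own encoding and font.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class HexDump {
    // Formats every UTF-16 char of s as "0x" plus four hex digits, separated by spaces.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Same word as in the question, spelled with escapes instead of raw Arabic letters.
        String str = "\u0645\u0631\u062d\u0628\u0627";

        // ASCII-only output: readable even on a terminal without Unicode support.
        System.out.println(toHex(str));

        // Write the raw text as UTF-8 bytes; correct display depends on the terminal.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(str);
    }
}
```

If the hex line shows values in the 0x06xx range, the string itself is intact and only the display is wrong. If it shows 0x003f, the characters were already replaced by question marks before printing.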

Answer 1 (score: 0)

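If I had to guess, I would say you are running under Windows with the default console settings (i.e. Raster fonts), and that you are running the Java program from the console rather than in Eclipse.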

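If that is the case, just change the console settings to use a TrueType font (Lucida Console or Consolas), and you should see boxes instead of question marks. Those will not look right either, but at least it is actual text rather than question marks.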

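Side note: question marks are common when Unicode is supported but the text gets converted to some other encoding somewhere along the way, e.g. Latin-1.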
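The effect described in the side note can be reproduced with a short sketch. The class name QuestionMarkDemo and the helper roundTrip are hypothetical, and the round trip merely simulates a lossy I/O layer; it does not identify where the conversion happens in the asker's setup. Encoding Arabic text as ISO-8859-1 (Latin-1) substitutes '?' for every character the charset cannot represent.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Encodes s to bytes in the given charset and decodes it again,
    // imitating text that passes through an I/O layer using that charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Same word as in the question, written with Unicode escapes.
        String str = "\u0645\u0631\u062d\u0628\u0627";

        // Latin-1 has no Arabic letters, so each one becomes '?'.
        System.out.println(roundTrip(str, StandardCharsets.ISO_8859_1)); // prints ?????

        // UTF-8 can represent every character, so the text survives unchanged.
        System.out.println(roundTrip(str, StandardCharsets.UTF_8).equals(str)); // prints true
    }
}
```

Since '?' is 0x3f, inside the printable ASCII range, displayChar from the question would return it unchanged. That would be consistent with the observed output if the substitution took place before displayChar was called, for example while reading the input.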