Huffman decoding gives the wrong bytes

Time: 2014-03-13 20:39:10

Tags: java byte decode decoding huffman-code

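I have tried a lot of different searches and forums, and I went to my professor's office hours. But he talked to me for 5 minutes and advised me not to do this extra-credit assignment.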

The Huffman algorithm itself is pretty straightforward, but decoding is a bit difficult.

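    import java.io.FileReader;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.Map;

    public class DecodeMain {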

    public static void main(String[] args) throws IOException {

        FileReader in = null;
        String codesFileName = "codes.txt";

        Map<String, Character> bin_string_map = new HashMap<String, Character>();

        try {
            in = new FileReader(codesFileName);
            int c;
            StringBuilder message = new StringBuilder();

            // read characters from the file into a string
            while ((c = in.read()) != -1) {
                message.append((char) c);
            }
            in.close();
            // split string by comma + space
            String[] splitted = message.toString().split(", ");

            // handle every entry except the first and the last
            for (int i = 1; i < splitted.length - 1; i++) {
                // key substring after '=' and value first char
                bin_string_map.put(splitted[i].substring(2),
                        splitted[i].charAt(0));
            }
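            // first entry: assumed to start with '{', e.g. "{a=010"
            bin_string_map.put(splitted[0].substring(3), splitted[0].charAt(1));
            // last entry: take the character from the last entry itself and
            // drop an assumed trailing '}', e.g. "z=0101}"
            String last = splitted[splitted.length - 1];
            bin_string_map.put(last.substring(2).replace("}", "").trim(),
                    last.charAt(0));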

        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());

        }

        Path path = Paths.get("compressed.txt");
        byte[] data = Files.readAllBytes(path);

        StringBuilder sb = new StringBuilder();
        // for (int i = 0; i < data.length; i++) {
        // sb.append(Integer.toBinaryString(data[i]));
        // }
        System.out.println(Integer.toBinaryString(data[0]));
        System.out.println(bin_string_map.get("1011101"));
        // T=101100101
        // h=0011
        // e=000

        // byte[0] 1011101
        // byte[1] 11111111111111111111111110110011
        // byte[2] 110111
        // System.out.println(sb.toString());

        // StringBuilder text = new StringBuilder();
        // int j = 0;
        // for (int i = 0; i < sb.toString().length(); i++) {
        // String key = sb.toString().substring(j, i);
        // if (bin_string_map.containsKey(key)) {
        // text.append(bin_string_map.get(key));
        // j = i;
        // }
        // }
        // System.out.println(text.toString());

    }
}

The problem is that when I read my compressed.txt file here, the first 3 bytes are:

  

]³7

But my book.txt starts with "The", which corresponds to different bit strings:

// T=101100101
// h=0011
// e=000

but the bytes are:
// byte[0] 1011101
// byte[1] 11111111111111111111111110110011
// byte[2] 110111
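
A note on the printed values above: `Integer.toBinaryString` takes an `int`, so a negative `byte` such as `byte[1]` is sign-extended to 32 bits, and leading zeros are dropped for the others (`1011101` is 7 digits, `110111` is 6). The following is a minimal sketch, not the asker's final code, of rendering each byte as exactly 8 bits and then decoding greedily against a code map. The class and method names (`BitStringSketch`, `toBits`, `decode`) are hypothetical, and the sample bytes `0x5D`, `0xB3`, `0x37` are inferred from the printed output. Whether the padded bit stream lines up with the codes still depends on how the encoder packed its bits, which the post does not show.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; none of these names come from the original post.
class BitStringSketch {

    // Render each byte as exactly 8 binary digits.
    static String toBits(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 8);
        for (byte b : data) {
            // Mask with 0xFF so a negative byte is not sign-extended to 32 bits.
            String bits = Integer.toBinaryString(b & 0xFF);
            // Integer.toBinaryString drops leading zeros, so pad back to 8 digits.
            for (int i = bits.length(); i < 8; i++) {
                sb.append('0');
            }
            sb.append(bits);
        }
        return sb.toString();
    }

    // Greedy prefix-code decoding: grow the candidate one bit at a time and
    // emit a character as soon as the candidate equals a known code.
    static String decode(String bits, Map<String, Character> codes) {
        StringBuilder text = new StringBuilder();
        int start = 0;
        for (int end = 1; end <= bits.length(); end++) {
            Character ch = codes.get(bits.substring(start, end));
            if (ch != null) {
                text.append(ch);
                start = end;
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Bytes assumed from the output shown in the question.
        byte[] data = {(byte) 0x5D, (byte) 0xB3, (byte) 0x37};
        System.out.println(toBits(data)); // 010111011011001100110111

        // Codes for 'T', 'h', 'e' as listed in the question.
        Map<String, Character> codes = new HashMap<>();
        codes.put("101100101", 'T');
        codes.put("0011", 'h');
        codes.put("000", 'e');
        System.out.println(decode("1011001010011000", codes)); // The
    }
}
```

Note that the upper bound in `decode` is `end <= bits.length()`, so the final code in the stream is included; a loop that stops at `i < length` with `substring(j, i)` never examines the last bit.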

0 answers:

No answers