Question

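This method is supposed to take a FileInputStream and, character by character, return a StringBuilder of all the 0s and 1s that lead to that character in this Huffman tree. However, I'm having some trouble: it just returns an instance of every path.

For example, if I have the tree: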

                             (10)

            (4)                               (6)

  (2=' ')             (2)                 (3='a') (3='b')

                (1=EOF)  (1='c')

File:

ab ab cab

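This returns more 1s and 0s than expected. I tested my tree-building methods and they seem to work. However, I assume something is wrong with my recursive method compress(). I believe this is because when it reaches a leaf that does not contain the desired character, it still returns a string with the path to that leaf, so it returns more than expected. If this is true, how do I discard the path to that leaf when it doesn't match?

Edit: I've been working on this all weekend, and this is what I have. The client code containing the GUI is very long, so I've omitted it. I've also omitted the method that prints the tree, since there is already enough code here.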

import java.io.*;
import java.util.*;

    public class HuffmanTree {
        public HuffmanNode overallRoot;
        Map<Character, Integer> storage; // gets the repeating times of a number
        ArrayList<HuffmanNode> nodes; // stores all nodes (will have only one node later)

        // constructor
        public HuffmanTree(Map<Character, Integer> counts) {
            storage = counts; // sets the map to this one // putAll instead?
            storage.put((char)4, 1); // put end of file character
            storage = sortByValue(storage); // map is now sorted by values
            nodes = storeNodes();
            createTree();
        }

        // creates nodes from each key/value in storage map
        private ArrayList<HuffmanNode> storeNodes() {
            List<Character> characters = new ArrayList<Character>(storage.keySet());
            ArrayList<HuffmanNode> output = new ArrayList<HuffmanNode>(); // stores all the nodes
            for (Character i: characters) {
                HuffmanNode temp = new HuffmanNode(storage.get(i), i);
                output.add(temp);
            }   
            return output; // output will be sorted by occurrences
        }

        // post: helper that sorts the map by value code 
        // Source: http://stackoverflow.com/questions/109383/how-to-sort-a-mapkey-value-on-the-values-in-java
        private static <Character, Integer extends Comparable<? super Integer>> Map<Character, Integer> 
            sortByValue( Map<Character, Integer> map ) {

            List<Map.Entry<Character, Integer>> list =
            new LinkedList<Map.Entry<Character, Integer>>( map.entrySet() );
            Collections.sort( list, new Comparator<Map.Entry<Character, Integer>>() {
                public int compare( Map.Entry<Character, Integer> o1, Map.Entry<Character, Integer> o2 ) {
                return (o1.getValue()).compareTo( o2.getValue() );
                }

            } );

            Map<Character, Integer> result = new LinkedHashMap<Character, Integer>();
            for (Map.Entry<Character, Integer> entry : list) {
                result.put( entry.getKey(), entry.getValue() );
            }
            return result;
        }

        // takes stuff from nodes and creates relationships with them
        private void createTree() {
            do { // keep running until nodes has only one elem
                HuffmanNode first = nodes.get(0); // gets the first two nodes
                HuffmanNode second = nodes.get(1);
                HuffmanNode combined;

                combined = new HuffmanNode(first.frequency + second.frequency); // combined huffman node
                combined.left = first;
                combined.right = second;

                nodes.remove(0); // then remove the first two elems from list
                nodes.remove(0);

                // goes through and adds combined into right spot
                boolean addAtEnd = true;
                for (int i = 0; i < nodes.size(); i++) {
                    if (nodes.get(i).frequency > combined.frequency) {
                        nodes.add(i, combined);
                        addAtEnd = false; // already added; don't add at end
                        break;
                    }
                } // need to add at end 
                if (addAtEnd) {
                    nodes.add(combined);
                }
                if (nodes.size() == 1) {
                    break;
                }

            } while (nodes.size() > 1);
        }

        // inputFile is a textFile // puts contents of file onto display window
        // nodes need to be made first
        // This is responsible for outputting 0's and 1's
        public StringBuilder compress(InputStream inputFile) throws IOException {
            StringBuilder result = new StringBuilder(); // stores resulting 1's and 0's 
            byte[] fileContent = new byte[20000000]; // creates a byte[]
            inputFile.read(fileContent);                // reads the input into fileContent
            String storage = new String(fileContent);  // contains entire file into this string to process

            // need to exclude default value
            String storage2 = ""; // keeps chars of value without default values
            for (int i = 0; i < storage.length(); i++) {
                if (storage.charAt(i) != '\u0000') {
                    storage2+=storage.charAt(i);
                } else {
                    break;
                }
            }

            for (int i = 0; i < storage2.length(); i++) { // goes through every char to get path
                String binary = findPath(storage2.charAt(i));
                result.append(binary); // add path to StringBuilder
            }
            return result;  
        }

        // return a stringbuilder of binary sequence by reading each character, searching the
        // tree then returning the path of 0's and 1's
        private String findPath(char input) {
            return findPath(input, nodes.get(0), "");
        }

        private String findPath(char input, HuffmanNode root, String path) {
            String result = "";
            if (!root.isLeaf()) {
                result += findPath(input, root.left, path += "0"); // go left
                result += findPath(input, root.right, path += "1"); // go right
            } if (root.isLeaf()) { // base case If at right leaf
                if (input == root.character) {
                    //System.out.println("found it");
                    return path;
                }
            }   
            return result;
        }
    }

Here is the individual node class:

import java.io.*;
import java.util.*;

// Stores each character, its number of occurrences, and connects to other nodes
public class HuffmanNode implements Comparable<HuffmanNode>{
    public int frequency;
    public char character;
    public HuffmanNode left;
    public HuffmanNode right;


    // constructor for leaf
    public HuffmanNode(int frequency, char character) {
        this.frequency = frequency;
        this.character = character;
        left = null;
        right = null;
    }

    // constructor for node w/ children
    public HuffmanNode(int frequency) {
        this.frequency = frequency;
        left = null;
        right = null;
    }

    // provides a count of characters in an input file and place in map
    public static Map<Character, Integer> getCounts(FileInputStream input) throws IOException {
        Map<Character, Integer> output = new TreeMap<Character, Integer>(); // treemap keeps keys in sorted order (chars alphabetized)
        byte[] fileContent = new byte[2000000]; // creates a byte[]
        //ArrayList<Byte> test = new ArrayList<Byte>();
        input.read(fileContent);                // reads the input into fileContent
        String test = new String(fileContent);  // contains entire file into this string to process
        //System.out.println(test);

        // goes through each character of String to put chars as keys and occurrences as values
        int i = 0;
        char temp = test.charAt(i);
        while (temp != '\u0000') { // while does not equal default value
            if (output.containsKey(temp)) { // seen this character before; increase count

                int count = output.get(temp);
                output.put(temp, count + 1);
                //System.out.println("repeat; char is: " + temp + "count is: " + output.get(temp)); // test
            } else {                        // Haven't seen this character before; create count of 1    
                output.put(temp, 1);
                //System.out.println("new; char is: " + temp + "count is: 1"); // test
            }
            i++;
            temp = test.charAt(i);
        }
        return output;
    }

    // sees if this node is a leaf
    public boolean isLeaf() {
        if (left == null && right == null) {
            return true;
        }
        return false;
    }

    @Override
    public int compareTo(HuffmanNode o) {
        // TODO Auto-generated method stub
        return 0;
    }
}

Answer 1

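The method private String findPath(char input, HuffmanNode root, String path) is the problem. You seem to be trying to use backtracking to search the tree, but you never unwind the path when you return. In particular, path += "0" reassigns path, so the call for the right child is passed the path with both "0" and "1" appended.

A simple fix is to return null for leaves that hold the wrong character, so that only the correct path is kept: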

private String findPath(char input, HuffmanNode root, String path) {
    String result;
    if (! root.isLeaf()) {
        if ((result = findPath(input, root.left, path + '0')) == null) {
            result = findPath(input, root.right, path + '1');
        }
    }
    else {
        result = (input == root.character) ? path : null;
    }
    return result;
}

Testing with your example, this correctly gives findPath('a', root, "") = "10" and findPath('c', root, "") = "011".
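As a rough check of the above, the sketch below rebuilds the example tree by hand and runs both versions side by side. FindPathDemo, its nested Node class (a stripped-down stand-in for HuffmanNode), buggyFindPath and exampleTree are names made up for this illustration; buggyFindPath is a condensed copy of the logic in the question.

```java
// Hypothetical demo class; Node is a minimal stand-in for the question's HuffmanNode.
class FindPathDemo {
    static class Node {
        final char character;
        final Node left, right;
        Node(char character) { this(character, null, null); }   // leaf
        Node(Node left, Node right) { this('\0', left, right); } // internal node
        private Node(char character, Node left, Node right) {
            this.character = character;
            this.left = left;
            this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Condensed copy of the question's logic.
    static String buggyFindPath(char input, Node root, String path) {
        String result = "";
        if (!root.isLeaf()) {
            result += buggyFindPath(input, root.left, path += "0");  // path now ends in "0"
            result += buggyFindPath(input, root.right, path += "1"); // so this passes "...01"
        }
        if (root.isLeaf() && input == root.character) {
            return path;
        }
        return result;
    }

    // Corrected version from this answer: null means "not in this subtree".
    static String findPath(char input, Node root, String path) {
        if (root.isLeaf()) {
            return input == root.character ? path : null;
        }
        String result = findPath(input, root.left, path + '0');
        return result != null ? result : findPath(input, root.right, path + '1');
    }

    // The tree from the question; (char) 4 is the EOF marker used there.
    static Node exampleTree() {
        Node left = new Node(new Node(' '), new Node(new Node((char) 4), new Node('c')));
        Node right = new Node(new Node('a'), new Node('b'));
        return new Node(left, right);
    }

    public static void main(String[] args) {
        Node root = exampleTree();
        for (char c : new char[] {' ', 'c', 'a', 'b'}) {
            System.out.println("'" + c + "' buggy=" + buggyFindPath(c, root, "")
                    + " fixed=" + findPath(c, root, ""));
        }
    }
}
```

On this tree the condensed original yields "010" for 'a' and "00101" for 'c', while the corrected version yields "10" and "011". Only ' ' comes out the same ("00"), because its path never takes a right branch.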

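However, you are doing a sequential search of the tree for every character read from the input. IMHO it would be more efficient to first build a hash map with each character as the key and its path as the value: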

private Map<Character, String> genMap(HuffmanNode root) {
    Map<Character, String> map = new HashMap<Character, String>();
    huffmanTreeAdd(map, root, "");
    return map;
}

private void huffmanTreeAdd(Map<Character, String> map, HuffmanNode root, String path) {
    if (root.isLeaf()) {
        map.put(root.character, path);
    }
    else {
        huffmanTreeAdd(map, root.left, path + '0');
        huffmanTreeAdd(map, root.right, path + '1');
    }
}

With a hash lookup instead of a sequential search, you get the path for each input character directly with String path = map.get(c);
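To illustrate how the map would be used for a whole input, here is a minimal self-contained sketch that builds the map once for the example tree and then encodes the sample text "ab ab cab". CodeMapDemo, its nested Node class, exampleTree and encode are hypothetical names for this illustration. The sketch does not append the EOF code, since the compress() in the question does not do so either.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; Node is a minimal stand-in for the question's HuffmanNode.
class CodeMapDemo {
    static class Node {
        final char character;
        final Node left, right;
        Node(char character) { this(character, null, null); }   // leaf
        Node(Node left, Node right) { this('\0', left, right); } // internal node
        private Node(char character, Node left, Node right) {
            this.character = character;
            this.left = left;
            this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Walks the tree once and records the path to every leaf.
    static Map<Character, String> genMap(Node root) {
        Map<Character, String> map = new HashMap<>();
        huffmanTreeAdd(map, root, "");
        return map;
    }

    private static void huffmanTreeAdd(Map<Character, String> map, Node root, String path) {
        if (root.isLeaf()) {
            map.put(root.character, path);
        } else {
            huffmanTreeAdd(map, root.left, path + '0');
            huffmanTreeAdd(map, root.right, path + '1');
        }
    }

    // Encodes text with one map lookup per character.
    static String encode(String text, Map<Character, String> codes) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            String code = codes.get(text.charAt(i));
            if (code == null) {
                throw new IllegalArgumentException("no code for character: " + text.charAt(i));
            }
            result.append(code);
        }
        return result.toString();
    }

    // The tree from the question; (char) 4 is the EOF marker used there.
    static Node exampleTree() {
        Node left = new Node(new Node(' '), new Node(new Node((char) 4), new Node('c')));
        Node right = new Node(new Node('a'), new Node('b'));
        return new Node(left, right);
    }

    public static void main(String[] args) {
        Map<Character, String> codes = genMap(exampleTree());
        System.out.println(encode("ab ab cab", codes)); // prints 1011001011000111011
    }
}
```

The 19 output bits are eight 2-bit codes (for 'a', 'b' and ' ') plus one 3-bit code for 'c'.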

Answer 2

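I suggest you use polymorphism for the two different kinds of nodes you have, something like the interpreter design pattern. It will help your code in terms of both readability and efficiency: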

abstract class HuffmanNode {
  private String code;
  public abstract int getFrequency();
  public abstract String getCode();

  public abstract void generateCodes(String code, Map<Character, String> codes);
  ...
}

class InternalNode extends HuffmanNode {
  private HuffmanNode left;
  private HuffmanNode right;
  ...
  public void generateCodes(String code, Map<Character, String> codes) {
    left.generateCodes(code + "0", codes);
    right.generateCodes(code + "1", codes);
  }
  ...
}

class CharacterNode extends HuffmanNode {
  private int frequency;
  private char character;
  ...
  public void generateCodes(String code, Map<Character, String> codes) {
    codes.put(character, code);
  }
  ...
}

You can populate codes for the whole tree like this (run it only once):

Map<Character, String> codes = new HashMap<>();
root.generateCodes("", codes);

Then use codes.get(input) instead of findPath. You can give other methods (for example getFrequency()) very clean recursive implementations in the same way.
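For reference, one way the elided parts could be filled in is sketched below as a single runnable file. This is only an assumption about what the "..." sections might contain: the constructors, the recursive getFrequency() in InternalNode, the wrapper class PolymorphicHuffmanDemo and exampleTree() are all invented for this illustration, and the code field and getCode() from the outline are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper class so the sketch compiles as one file.
class PolymorphicHuffmanDemo {
    static abstract class HuffmanNode {
        abstract int getFrequency();
        abstract void generateCodes(String code, Map<Character, String> codes);
    }

    static final class InternalNode extends HuffmanNode {
        private final HuffmanNode left;
        private final HuffmanNode right;

        InternalNode(HuffmanNode left, HuffmanNode right) {
            this.left = left;
            this.right = right;
        }

        // Assumed implementation: an internal node's frequency is the sum of its children.
        @Override
        int getFrequency() {
            return left.getFrequency() + right.getFrequency();
        }

        @Override
        void generateCodes(String code, Map<Character, String> codes) {
            left.generateCodes(code + "0", codes);
            right.generateCodes(code + "1", codes);
        }
    }

    static final class CharacterNode extends HuffmanNode {
        private final int frequency;
        private final char character;

        CharacterNode(int frequency, char character) {
            this.frequency = frequency;
            this.character = character;
        }

        @Override
        int getFrequency() {
            return frequency;
        }

        @Override
        void generateCodes(String code, Map<Character, String> codes) {
            codes.put(character, code);
        }
    }

    // The tree from the question, with its leaf frequencies; (char) 4 is the EOF marker.
    static HuffmanNode exampleTree() {
        HuffmanNode left = new InternalNode(
                new CharacterNode(2, ' '),
                new InternalNode(new CharacterNode(1, (char) 4), new CharacterNode(1, 'c')));
        HuffmanNode right = new InternalNode(new CharacterNode(3, 'a'), new CharacterNode(3, 'b'));
        return new InternalNode(left, right);
    }

    public static void main(String[] args) {
        HuffmanNode root = exampleTree();
        Map<Character, String> codes = new HashMap<>();
        root.generateCodes("", codes);
        System.out.println(root.getFrequency()); // prints 10
        System.out.println(codes.get('c'));      // prints 011
    }
}
```

No isLeaf() check is needed anywhere in this version, because each node type only implements the behavior that applies to it.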

Printing the correct path in a Huffman tree
