这个方法应该逐个字符地采用FileInputStream,它应该返回一个所有0和1的StringBuilder,它导致这个Huffman树中的那个字符。但是,我遇到了一些问题,它只返回每个路径的实例。
例如,如果我有树:
(10)
(4) (6)
(2=' ') (2) (3='a') (3='b')
(1=EOF) (1='c')
文件:
ab ab cab
会比预期更多返回1和0。我测试了我的构建树方法,它们似乎工作。但是,我假设我的递归方法compress()有问题。我相信这是因为当它到达一个不包含所需字符的叶子时,它仍然会返回一条到该叶子的路径的字符串。因此,它将比预期的返回更多。如果这是真的,那么如果它不匹配,如何消除该叶子的路径?
编辑:我整个周末一直都在努力,这就是我所拥有的:包含GUI的客户端代码非常长,所以我省略了它。我也省略了打印树的方法,因为这里已经有足够的代码了。
import java.io.*;
import java.util.*;
public class HuffmanTree {
public HuffmanNode overallRoot;
Map<Character, Integer> storage; // gets the repeating times of a number
ArrayList<HuffmanNode> nodes; // stores all nodes (will have only one node later)
// constructor
public HuffmanTree(Map<Character, Integer> counts) {
storage = counts; // sets the map to this one // putAll instead?
storage.put((char)4, 1); // put end of file character
storage = sortByValue(storage); // map is now sorted by values
nodes = storeNodes();
createTree();
}
// creates nodes from each key/value in storage map
private ArrayList<HuffmanNode> storeNodes() {
List<Character> characters = new ArrayList<Character>(storage.keySet());
ArrayList<HuffmanNode> output = new ArrayList<HuffmanNode>(); // stores all the nodes
for (Character i: characters) {
HuffmanNode temp = new HuffmanNode(storage.get(i), i);
output.add(temp);
}
return output; // output will be sorted by occurrences
}
// post: helper that sorts the map by value code
// Source: http://stackoverflow.com/questions/109383/how-to-sort-a-mapkey-value-on-the-values-in-java
private static <Character, Integer extends Comparable<? super Integer>> Map<Character, Integer>
sortByValue( Map<Character, Integer> map ) {
List<Map.Entry<Character, Integer>> list =
new LinkedList<Map.Entry<Character, Integer>>( map.entrySet() );
Collections.sort( list, new Comparator<Map.Entry<Character, Integer>>() {
public int compare( Map.Entry<Character, Integer> o1, Map.Entry<Character, Integer> o2 ) {
return (o1.getValue()).compareTo( o2.getValue() );
}
} );
Map<Character, Integer> result = new LinkedHashMap<Character, Integer>();
for (Map.Entry<Character, Integer> entry : list) {
result.put( entry.getKey(), entry.getValue() );
}
return result;
}
// takes stuff from nodes and creates relationships with them
private void createTree() {
do { // keep running until nodes has only one elem
HuffmanNode first = nodes.get(0); // gets the first two nodes
HuffmanNode second = nodes.get(1);
HuffmanNode combined;
combined = new HuffmanNode(first.frequency + second.frequency); // combined huffman node
combined.left = first;
combined.right = second;
nodes.remove(0); // then remove the first two elems from list
nodes.remove(0);
// goes through and adds combined into right spot
boolean addAtEnd = true;
for (int i = 0; i < nodes.size(); i++) {
if (nodes.get(i).frequency > combined.frequency) {
nodes.add(i, combined);
addAtEnd = false; // already added; don't add at end
break;
}
} // need to add at end
if (addAtEnd) {
nodes.add(combined);
}
if (nodes.size() == 1) {
break;
}
} while (nodes.size() > 1);
}
// inputFile is a textFile // puts contents of file onto display window
// nodes need to be made first
// This is responsible for outputting 0's and 1's
public StringBuilder compress(InputStream inputFile) throws IOException {
StringBuilder result = new StringBuilder(); // stores resulting 1's and 0's
byte[] fileContent = new byte[20000000]; // creates a byte[]
inputFile.read(fileContent); // reads the input into fileContent
String storage = new String(fileContent); // contains entire file into this string to process
// need to exclude default value
String storage2 = ""; // keeps chars of value without default values
for (int i = 0; i < storage.length(); i++) {
if (storage.charAt(i) != '\u0000') {
storage2+=storage.charAt(i);
} else {
break;
}
}
for (int i = 0; i < storage2.length(); i++) { // goes through every char to get path
String binary = findPath(storage2.charAt(i));
result.append(binary); // add path to StringBuilder
}
return result;
}
// return a stringbuilder of binary sequence by reading each character, searching the
// tree then returning the path of 0's and 1's
private String findPath(char input) {
return findPath(input, nodes.get(0), "");
}
private String findPath(char input, HuffmanNode root, String path) {
String result = "";
if (!root.isLeaf()) {
result += findPath(input, root.left, path += "0"); // go left
result += findPath(input, root.right, path += "1"); // go right
} if (root.isLeaf()) { // base case If at right leaf
if (input == root.character) {
//System.out.println("found it");
return path;
}
}
return result;
}
}
以下是单个节点类:
import java.io.*;
import java.util.*;
// Stores each character, its number of occurrences, and connects to other nodes
public class HuffmanNode implements Comparable<HuffmanNode>{
public int frequency;
public char character;
public HuffmanNode left;
public HuffmanNode right;
// constructor for leaf
public HuffmanNode(int frequency, char character) {
this.frequency = frequency;
this.character = character;
left = null;
right = null;
}
// constructor for node w/ children
public HuffmanNode(int frequency) {
this.frequency = frequency;
left = null;
right = null;
}
// provides a count of characters in an input file and place in map
public static Map<Character, Integer> getCounts(FileInputStream input) throws IOException {
Map<Character, Integer> output = new TreeMap<Character, Integer>(); // treemap keeps keys in sorted order (chars alphabetized)
byte[] fileContent = new byte[2000000]; // creates a byte[]
//ArrayList<Byte> test = new ArrayList<Byte>();
input.read(fileContent); // reads the input into fileContent
String test = new String(fileContent); // contains entire file into this string to process
//System.out.println(test);
// goes through each character of String to put chars as keys and occurrences as keys
int i = 0;
char temp = test.charAt(i);
while (temp != '\u0000') { // while does not equal default value
if (output.containsKey(temp)) { // seen this character before; increase count
int count = output.get(temp);
output.put(temp, count + 1);
//System.out.println("repeat; char is: " + temp + "count is: " + output.get(temp)); // test
} else { // Haven't seen this character before; create count of 1
output.put(temp, 1);
//System.out.println("new; char is: " + temp + "count is: 1"); // test
}
i++;
temp = test.charAt(i);
}
return output;
}
// sees if this node is a leaf
public boolean isLeaf() {
if (left == null && right == null) {
return true;
}
return false;
}
@Override
public int compareTo(HuffmanNode o) {
// TODO Auto-generated method stub
return 0;
}
}
答案 0 :(得分:1)
方法private String findPath(char input, HuffmanNode root, String path)
是问题所在。您似乎尝试使用回溯来搜索树,但在返回时您永远不会取消堆栈。
一个简单的解决方案是为不正确字符的叶子返回null,以便只保留正确的路径:
private String findPath(char input, HuffmanNode root, String path) {
String result;
if (! root.isLeaf()) {
if ((result = findPath(input, root.left, path + '0')) == null) {
result = findPath(input, root.right, path + '1');
}
}
else {
result = (input == root.character) ? path : null;
}
return result;
}
使用您的示例进行测试,正确地提供了findPath('a', root, "") = "10"
和findPath('c', root, "") = "011"
但是您正在对从输入中读取的每个字符进行顺序搜索。恕我直言,首先创建一个散列,每个字符作为键,路径作为值更高效:
private Map<Character, String> genMap(HuffmanNode root) {
Map<Character, String> map = new HashMap<Character, String>();
huffmanTreeAdd(map, root, "");
return map;
}
private void huffmanTreeAdd(Map<Character, String> map, HuffmanNode root, String path) {
if (root.isLeaf()) {
map.put(root.character, path);
}
else {
huffmanTreeAdd(map, root.left, path + '0');
huffmanTreeAdd(map, root.right, path + '1');
}
}
通过哈希搜索而不是顺序搜索,您可以直接获得使用String path = map.get(c);
输入的每个字符的路径。
答案 1 :(得分:0)
我建议你在你拥有的两种不同类型的节点上使用多态。像interpreter design pattern这样的东西。它将有助于您的代码在可读性和效率方面:
abstract class HuffmanNode {
private String code;
public abstract int getFrequency();
public abstract String getCode();
public abstract void generateCodes(String code, Map<Character, String> codes);
...
}
class InternalNode {
private HuffmanNode left;
private HuffmanNode right;
...
public void generateCodes(String code, Map<Character, String> codes) {
left.generateCodes(code + "0", codes);
right.generateCodes(code + "1", codes);
}
...
}
class CharacterNode {
private int frequency;
private char character;
...
public void generateCodes(String code, Map<Character, String> codes) {
codes.put(character, code);
}
...
}
您可以像这样填充整个树的codes
(仅运行一次):
Map<Character, String> codes = new HashMap<>();
root.generateCodes("", codes);
然后使用codes.get(input)
代替findPath
。
您可以为其他方法(例如getFrequency()
)提供非常干净的递归实现。