Question

I am following the Rosetta Code Java implementation.

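I am trying to do LZW encoding with my own dictionary instead of the ASCII dictionary the example uses. With my own dictionary, decoding goes wrong: every decoded word is missing its leading 'a'. The result should be 'abraca abrac abra', not 'braca brac bra'.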

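I traced the problem to the line String act = "" + (char)(int)compressed.remove(0); in the decode() method. It drops every leading 'a', but I have no idea how to change it. For example, if I use String act = ""; instead, the output is badly wrong, and I don't know what else to put there. I can't see how to fix this small problem, or perhaps I am looking for the solution in the wrong place.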

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LZW {


public static List<Integer> encode(String uncompressed) {

    Map<String,Integer> dictionary = DictionaryInitStringInt();        
    int dictSize = dictionary.size();

    String act = "";
    List<Integer> result = new ArrayList<Integer>();

    for (char c : uncompressed.toCharArray()) {
        String next = act + c;
        if (dictionary.containsKey(next))
            act = next;
        else {
            result.add(dictionary.get(act));
            // Add next to the dictionary.
            dictionary.put(next, dictSize++);
            act = "" + c;
        }
    }

    // Output the code for act.
    if (!act.equals(""))
        result.add(dictionary.get(act));
    return result;
} 

public static String decode(List<Integer> compressed) {

    Map<Integer,String> dictionary = DictionaryInitIntString();        
    int dictSize = dictionary.size();

    String act = "" + (char)(int)compressed.remove(0);
    //String act = "";
    String result = act;

    for (int k : compressed) {            
        String entry;
        if (dictionary.containsKey(k))
            entry = dictionary.get(k);
        else if (k == dictSize)
            entry = act + act.charAt(0);
        else
            throw new IllegalArgumentException("No such key: " + k);

        result += entry;

        dictionary.put(dictSize++, act + entry.charAt(0));

        act = entry;
    }
    return result;
}

public static Map<String,Integer> DictionaryInitStringInt()
{               
    char[] characters = {'a','b','c','d','e','f','g','h','i','j', 'k','l','m','n',
                    'o','p','q','r','s','t','u','v','w','x','y','z',' ','!',
                    '?','.',','};
    int charactersLength = characters.length;

    Map<String,Integer> dictionary = new HashMap<String,Integer>();

    for (int i = 0; i < charactersLength; i++)
            dictionary.put("" + characters[i], i); 

    return dictionary;
}

public static Map<Integer,String> DictionaryInitIntString()
{               
    char[] characters = {'a','b','c','d','e','f','g','h','i','j', 'k','l','m','n',
                    'o','p','q','r','s','t','u','v','w','x','y','z',' ','!',
                    '?','.',','};
    int charactersLength = characters.length;

    Map<Integer,String> dictionary = new HashMap<Integer,String>();

    for (int i = 0; i < charactersLength; i++)
            dictionary.put(i,"" + characters[i]); 

    return dictionary;
}

public static void main(String[] args) {

    List<Integer> compressed = encode("abraca abrac abra");
    System.out.println(compressed);

    String decodeed = decode(compressed);
    // decodeed will be 'braca brac bra'
    System.out.println(decodeed);
}

}

Answer 1

The Rosetta example uses

"" + (char) (int) compressed.remove(0);

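because the first 256 entries of its dictionary map happen to be exactly the 'char' values, so casting a code to char gives the same string as looking it up.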
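A small sketch of that coincidence, for illustration only. The class name `CastDemo` and its helper methods are made up here; the custom alphabet is the one from the question. In the ASCII-style table, code 97 and the cast both give "a". In the custom table, code 0 means "a", but `(char) 0` is the NUL character.

```java
import java.util.HashMap;
import java.util.Map;

public class CastDemo {
    // Hypothetical helper: ASCII-style table, code i maps to the character with value i.
    public static Map<Integer, String> asciiDictionary() {
        Map<Integer, String> d = new HashMap<>();
        for (int i = 0; i < 256; i++) {
            d.put(i, "" + (char) i);
        }
        return d;
    }

    // Hypothetical helper: the question's custom table, code i maps to the i-th alphabet symbol.
    public static Map<Integer, String> customDictionary() {
        String alphabet = "abcdefghijklmnopqrstuvwxyz !?.,";
        Map<Integer, String> d = new HashMap<>();
        for (int i = 0; i < alphabet.length(); i++) {
            d.put(i, "" + alphabet.charAt(i));
        }
        return d;
    }

    // What the original decode() line does with the first code.
    public static String viaCast(int code) {
        return "" + (char) code;
    }

    public static void main(String[] args) {
        // ASCII table: cast and lookup agree.
        System.out.println(viaCast(97).equals(asciiDictionary().get(97))); // true
        // Custom table: the cast yields NUL, the lookup yields "a".
        System.out.println(viaCast(0).equals(customDictionary().get(0)));  // false
    }
}
```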

With a custom dictionary, this line should be:

String act = dictionary.get(compressed.remove(0));
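For context, here is a minimal round-trip sketch with that one-line change applied. The class name `LzwCustomDictionary` and the `ALPHABET` constant are assumptions made for this example, and it assumes the input contains only characters from the custom alphabet. Unlike the question's code, it also copies the input list before removing the first code, so the caller's list is left untouched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LzwCustomDictionary {
    // Assumed alphabet, in the same order as the question's character array.
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz !?.,";

    public static List<Integer> encode(String text) {
        Map<String, Integer> dictionary = new HashMap<>();
        for (int i = 0; i < ALPHABET.length(); i++) {
            dictionary.put("" + ALPHABET.charAt(i), i);
        }
        int dictSize = dictionary.size();

        String act = "";
        List<Integer> result = new ArrayList<>();
        for (char c : text.toCharArray()) {
            String next = act + c;
            if (dictionary.containsKey(next)) {
                act = next;
            } else {
                result.add(dictionary.get(act));
                dictionary.put(next, dictSize++); // learn the new phrase
                act = "" + c;
            }
        }
        if (!act.isEmpty()) {
            result.add(dictionary.get(act));
        }
        return result;
    }

    public static String decode(List<Integer> compressed) {
        if (compressed.isEmpty()) {
            return "";
        }
        Map<Integer, String> dictionary = new HashMap<>();
        for (int i = 0; i < ALPHABET.length(); i++) {
            dictionary.put(i, "" + ALPHABET.charAt(i));
        }
        int dictSize = dictionary.size();

        List<Integer> codes = new ArrayList<>(compressed); // work on a copy
        // The fix: look the first code up in the dictionary instead of casting it to char.
        String act = dictionary.get(codes.remove(0));
        StringBuilder result = new StringBuilder(act);

        for (int k : codes) {
            String entry;
            if (dictionary.containsKey(k)) {
                entry = dictionary.get(k);
            } else if (k == dictSize) {
                entry = act + act.charAt(0); // code refers to the phrase being built
            } else {
                throw new IllegalArgumentException("No such key: " + k);
            }
            result.append(entry);
            dictionary.put(dictSize++, act + entry.charAt(0));
            act = entry;
        }
        return result.toString();
    }

    public static void main(String[] args) {
        List<Integer> compressed = encode("abraca abrac abra");
        System.out.println(decode(compressed)); // prints abraca abrac abra
    }
}
```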
