Question

I'm trying to build a decoder. For example, given this code table:

a           100100
b           100101
c           110001
d           100000
[newline]   111111
p           111110
q           000001

Encoded bits:

111110000001100100111111100101110001111110

This should give:

pqa
bcp

The code I tried:

static string decode(string[] codes, string encoded)
{
    string finalString = string.Empty;
    string temp = string.Empty;
    System.Collections.Hashtable hashtable = new System.Collections.Hashtable();

    for (int i = 0; i < codes.Length; i++)
    {
        if (codes[i].Split()[0].Contains("new"))
        {
            var newLine = Environment.NewLine;
            hashtable.Add(codes[i].Split()[1], newLine);
            continue;
        }
        hashtable.Add(codes[i].Split()[1], codes[i].Split()[0]);
    }

    for (int i = 0; i <= encoded.Length; i++)
    {
        if (codes.Length!=finalString.Length)
        {
            temp = temp + encoded[i];
            if (hashtable[temp] != null)
            {
                finalString = finalString + hashtable[temp].ToString();
                encoded = encoded.Remove(0, temp.Length);
                temp = string.Empty;
                i = -1;
            }
        }
    }
    return finalString;
}

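But most of the test cases seem to fail. Can anyone point out what I'm doing wrong? Also, is there any way to improve the performance of the code above?

Is there a faster way to decode? The encoded string can be up to 7000 characters long and there can be up to 100 codes, and it takes a long time to run. I looked into Huffman decoding. Can it be used to fix this?

Thanks in advance.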

Answer 1

If I run your code with the following arguments:

string[] str = new string[] { "a 100100","b 100101","c 110001","d 100000","[newline] 111111","p 111110","q 000001" };
string encoded = @"111110000001100100111111100101110001111110";
string res = decode(str, encoded);

The result is "pqa\r\nbc". That is not far from the expected result; only the last character is missing.

You can fix this by simplifying the second for loop to:

for (int i = 0; i < encoded.Length; i++)
{
    temp = temp + encoded[i];
    if (hashtable[temp] != null)
    {
        finalString = finalString + hashtable[temp].ToString();
        encoded = encoded.Remove(0, temp.Length);
        temp = string.Empty;
        i = -1;
    }
}

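I removed the if because I didn't understand its purpose :P As written, it stops decoding once finalString is as long as the codes array, which is why the last character went missing. Also note i < encoded.Length instead of i <= encoded.Length; the latter would cause an exception.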
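
On the performance question, the same loop can be written without shrinking encoded and resetting i after every match. What follows is only a minimal sketch, not the asker's code: the names PrefixDecoder and DecodePrefix are hypothetical, the table is assumed to map bit strings to symbols, and the codes are assumed to form a prefix code (no code is the start of another), which is what makes taking the first match safe.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class PrefixDecoder
{
    // Hypothetical helper: table maps bit strings to symbols, e.g. "100100" -> "a".
    public static string DecodePrefix(IDictionary<string, string> table, string encoded)
    {
        var result = new StringBuilder();   // avoids repeated string concatenation
        var current = new StringBuilder();  // bits collected since the last match

        foreach (char bit in encoded)
        {
            current.Append(bit);
            // Assumption: prefix code, so the first match is the only possible one.
            if (table.TryGetValue(current.ToString(), out string symbol))
            {
                result.Append(symbol);
                current.Clear();
            }
        }
        // Trailing bits that match no code are silently ignored in this sketch.
        return result.ToString();
    }
}
```

Each input character is visited once, and the input string is never copied, so the work grows roughly with the length of encoded times the code length rather than with repeated Remove calls.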

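Some notes on your implementation:

Instead of Hashtable you can use Dictionary<string, string> and dictionary.TryGetValue(...) to test whether it contains a given string.
If the codes always have the same length, you can simplify the code so that it does not have to iterate over each character separately.
Save the result of codes[i].Split() in a variable instead of calling it several times.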
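
The three notes above could be combined roughly as follows. This is a sketch under stated assumptions, not a drop-in replacement: FixedWidthDecoder and DecodeFixed are hypothetical names, every code is assumed to have the same length (6 bits in the question's example), and each entry in codes is assumed to have the form "symbol bits" with "[newline]" as the only special symbol.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class FixedWidthDecoder
{
    public static string DecodeFixed(string[] codes, string encoded)
    {
        var table = new Dictionary<string, string>();
        int width = 0;
        foreach (string entry in codes)
        {
            string[] parts = entry.Split();  // split once, reuse both halves
            string symbol = parts[0] == "[newline]" ? Environment.NewLine : parts[0];
            table[parts[1]] = symbol;
            width = parts[1].Length;         // assumption: identical for every entry
        }

        var result = new StringBuilder();
        // Read one whole code per step instead of one character at a time.
        for (int pos = 0; width > 0 && pos + width <= encoded.Length; pos += width)
        {
            string chunk = encoded.Substring(pos, width);
            if (!table.TryGetValue(chunk, out string symbol))
                throw new FormatException("Unknown code at position " + pos);
            result.Append(symbol);
        }
        return result.ToString();
    }
}
```

Called with the str and encoded values from the beginning of this answer, DecodeFixed(str, encoded) should return "pqa", a line break, then "bcp". If the code lengths can differ, as in a real Huffman code, this fixed-width shortcut does not apply and the input has to be matched bit by bit.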
