I want to generate a table that prints the frequencies and codewords for the DNA alphabet. The output I should get is:
Symbol: T Codeword: 000 Freq: 10
Symbol: T Codeword: 001 Freq: 15
Symbol: T Codeword: 01 Freq: 25
Symbol: T Codeword: 1 Freq: 50
Average VLC codeword length: 1.75 bits per symbol
Average fixed length codeword length: 2 bits per symbol
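However, for the average VLC codeword length I get a long decimal number, and the same goes for the fixed codeword length. On top of that, the fixed length should be greater than the VLC length, but mine is the other way around. I think I'm implementing the log call incorrectly, but what exactly am I doing wrong? Here is the code: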
def main():
    dnaData = readFile()
    dataSymbol = symbol(dnaData)
    node = symNode(dataSymbol)
    heap = mkHeap(len(node), compareFunc)
    dataCollectNode(node, heap)
    while heap.size > 1:
        n1 = removeMin(heap)
        n2 = removeMin(heap)
        for element in n1.symbol:
            element.code = ('0' + element.code)
        for element in n2.symbol:
            element.code = ('1' + element.code)
        newNode = mkNode((n1.cumFreq+n2.cumFreq),(n1.symbol + n2.symbol))
        add(heap, newNode)
    print("Variable length code output...")
    print("---------------------------------------")
    total_different_symbols = 0
    heapNode = top(heap)
    for element in heapNode.symbol:
        print("Symbol: %2s " % element.name, end ='')
        print("Codeword: %8s " % element.code, end ='')
        print("Frequency: %5d " % element.freq)
        temp = int(element.freq)*len(element.code)
        total_different_symbols += temp
    total_different_symbols = total_different_symbols / heapNode.cumFreq
    print("Average VLC codeword length: ", total_different_symbols, " bits per symbols")
    average_fixed_length_codeword = log(total_different_symbols)
    print("Average fixed length codeword length: ", average_fixed_length_codeword, " bits per symbol")
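Any hints?
Answer 0 (score: 0)
Got it: I had to change it to log(total_different_symbols, 2), so that the logarithm is taken in base 2 instead of the default natural logarithm.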
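To make the two averages concrete, here is a minimal sketch that recomputes them from the expected table in the question. It is not the asker's heap-based code. The symbol names `s1`..`s4` and the helper functions are hypothetical, and the sketch assumes that the fixed-length figure is meant to come from the number of distinct symbols. In the posted code, `total_different_symbols` already holds the average VLC length by the time `log` is called, so a separate count is used here.

```python
import math

# Codewords and frequencies copied from the expected output in the question.
# The symbol names are placeholders (assumption); only lengths and counts matter.
table = [
    ("s1", "000", 10),
    ("s2", "001", 15),
    ("s3", "01", 25),
    ("s4", "1", 50),
]


def average_vlc_length(rows):
    """Frequency-weighted mean codeword length, in bits per symbol."""
    total_freq = sum(freq for _, _, freq in rows)
    weighted_bits = sum(len(code) * freq for _, code, freq in rows)
    return weighted_bits / total_freq


def fixed_length(rows):
    """Bits needed to give every distinct symbol an equal-length codeword."""
    return math.ceil(math.log2(len(rows)))


avg_vlc = average_vlc_length(table)
fixed = fixed_length(table)

print("Average VLC codeword length: %.2f bits per symbol" % avg_vlc)
print("Average fixed length codeword length: %d bits per symbol" % fixed)

# math.log with one argument is the natural logarithm, which is why the
# original call did not produce a value in bits.
natural = math.log(len(table))
base_two = math.log(len(table), 2)
```

Formatting the average with `%.2f` also avoids the long decimal expansion mentioned in the question when the frequencies do not divide evenly.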