I am writing code to compute the Shannon entropy of a string.
    Dim entropytext As String = Result.Text
    ' Count each character, then group the characters that share the same count.
    Dim theresult = entropytext.GroupBy(Function(o) o) _
        .Select(Function(o) New With {.Count = o.Count(), .Character = o.Key}) _
        .GroupBy(Function(o) o.Count, Function(o) o.Character) _
        .OrderByDescending(Function(o) o.Key)
    Dim totalEntropy As Double = 0
    Dim partialEntropy As Double
    Dim partialP As Double
    For Each item In theresult
        Console.Write(item.Key & " of chars: ")
        For Each character In item
            Console.Write(character)
        Next
        ' item.Key is the occurrence count shared by every character in this group.
        partialP = item.Key / entropytext.Count
        Console.Write(". p of each " & partialP & ", total p = " & item.Count * partialP)
        ' Every character in the group contributes the same p * log(p) term.
        partialEntropy = partialP * Math.Log(partialP) * item.Count
        totalEntropy += partialEntropy
        Console.WriteLine()
    Next
    totalEntropy *= -1
    ' Math.Log is the natural logarithm, so this value is in nats unless divided by Math.Log(2).
    TextBox1.Text = totalEntropy & " Bits"
End Sub
The math:
Entropy = -∑ P_x log(P_x)
P_x = N_x / ∑ N_x
where P_x is the probability of letter x,
and N_x is the number of occurrences of letter x.
So, for
textbox1 = "AATC"
Entropy(textbox1) = -([2/4 log(2/4)] + [1/4 log(1/4)] + [1/4 log(1/4)])
= 1.0397
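As a check on the arithmetic in this example, here it is written out with the logarithm base made explicit. The value 1.0397 is what the natural logarithm gives, and `Math.Log` in .NET is the natural logarithm, so that figure is in nats rather than bits. Dividing by ln 2, as the C# function in this question does with `Math.Log(2)`, converts it to bits. The labels H_nats and H_bits are introduced here only for illustration.

```latex
\begin{aligned}
H_{\text{nats}} &= -\left(\tfrac{2}{4}\ln\tfrac{2}{4} + \tfrac{1}{4}\ln\tfrac{1}{4} + \tfrac{1}{4}\ln\tfrac{1}{4}\right) \\
                &= \tfrac{1}{2}\ln 2 + \tfrac{1}{2}\ln 4 = \tfrac{3}{2}\ln 2 \approx 1.0397 \\
H_{\text{bits}} &= \frac{H_{\text{nats}}}{\ln 2} = 1.5
\end{aligned}
```

So for "AATC" a natural-log implementation should print roughly 1.0397, while a base-2 implementation should print 1.5.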
Basically, it should work like this... but I am not proficient in C#...
public static double ShannonEntropy(string s)
{
    var map = new Dictionary<char, int>();
    foreach (char c in s)
    {
        if (!map.ContainsKey(c))
            map.Add(c, 1);
        else
            map[c] += 1;
    }
    double result = 0.0;
    int len = s.Length;
    foreach (var item in map)
    {
        var frequency = (double)item.Value / len;
        result -= frequency * (Math.Log(frequency) / Math.Log(2));
    }
    return result;
}
Answer 0 (score: 1)
This is a direct port of the C# code to VB.NET:
Public Shared Function ShannonEntropy(s As String) As Double
    ' Count how many times each character occurs.
    Dim map = New Dictionary(Of Char, Integer)()
    For Each c As Char In s
        If Not map.ContainsKey(c) Then
            map.Add(c, 1)
        Else
            map(c) += 1
        End If
    Next
    Dim result As Double = 0.0
    Dim len As Integer = s.Length
    ' VB.NET has no "var" keyword; the element type is inferred (Option Infer On).
    For Each item In map
        Dim frequency = CDbl(item.Value) / len
        ' Dividing by Math.Log(2) converts the natural log to base 2 (bits).
        result -= frequency * (Math.Log(frequency) / Math.Log(2))
    Next
    Return result
End Function
If the C# code produces the result you are looking for, this code will give the same result.