How to implement MurmurHash3 in VB.NET

Date: 2012-03-10 18:25:19

Tags: vb.net bytearray hash bit-manipulation c#-to-vb.net

I am trying to implement MurmurHash3 in VB.NET by converting this C# implementation.

The first part of the function in C#:
public static SqlInt32 MurmurHash3(SqlBinary data)
{
const UInt32 c1 = 0xcc9e2d51;
const UInt32 c2 = 0x1b873593;

int curLength = data.Length;    /* Current position in byte array */
int length = curLength;   /* the const length we need to fix tail */
UInt32 h1 = seed;
UInt32 k1 = 0;

/* body, eat stream a 32-bit int at a time */
Int32 currentIndex = 0;
while (curLength >= 4)
{
  /* Get four bytes from the input into an UInt32 */
  k1 = (UInt32)(data[currentIndex++]
    | data[currentIndex++] << 8
    | data[currentIndex++] << 16
    | data[currentIndex++] << 24);

  /* bitmagic hash */
  k1 *= c1;
  k1 = rotl32(k1, 15);
  k1 *= c2;

  h1 ^= k1;
  h1 = rotl32(h1, 13);
  h1 = h1 * 5 + 0xe6546b64;
  curLength -= 4;
}

The same in VB.NET:

Public Shared Function MurmurHash3(data As Byte()) As Int32
    Const c1 As UInt32 = &HCC9E2D51UI
    Const c2 As UInt32 = &H1B873593

    Dim curLength As Integer = data.Length
    ' Current position in byte array 
    Dim length As Integer = curLength
    ' the const length we need to fix tail 
    Dim h1 As UInt32 = seed
    Dim k1 As UInt32 = 0

    ' body, eat stream a 32-bit int at a time 
    Dim dBytes As Byte()
    Dim currentIndex As Int32 = 0
    While curLength >= 4
        ' Get four bytes from the input into an UInt32 
        dBytes = New Byte() {data(currentIndex), data(currentIndex + 1), data(currentIndex + 2), data(currentIndex + 3)}
        k1 = BitConverter.ToUInt32(dBytes, 0)

        currentIndex += 4
        ' bitmagic hash 

        k1 *= c1
        k1 = rotl32(k1, 15)
        k1 *= c2

        h1 = h1 Xor k1
        h1 = rotl32(h1, 13)
        h1 = h1 * 5 + &HE6546B64UI
        curLength -= 4
    End While

Private Shared Function rotl32(x As UInt32, r As Byte) As UInt32
    Return (x << r) Or (x >> (32 - r))
End Function

The line k1 *= c1 throws the error "Arithmetic operation resulted in an overflow."

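Any suggestions on how to implement this? I don't know whether the problem lies in the part that gets four bytes from the input into a UInt32, or whether it is related to something else, since there are some differences between C# and VB in how bitwise operations work.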

For reference, a Java implementation also exists: https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java

2 Answers:

Answer 0 (score: 1)

I would first convert the 32-bit k1 into a 64-bit variant, for example:

k1_64 = CType(k1, UInt64)

For the calculation modulo 2^32, do

k1_64 = (k1_64 * c1) And &HFFFFFFFFUI

Finally, convert back to 32 bits:

k1 = CType(k1_64 And &HFFFFFFFFUI, UInt32)

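For better performance, you may also want to consider replacing the BitConverter.ToUInt32 call with something else, since the current code allocates a new four-byte array on every iteration.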
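One possible replacement for the BitConverter call is to assemble the value with shifts, as the C# original does. The sketch below is only an illustration: the helper name ReadUInt32LE and the module around it are hypothetical, not part of the question's code. Shifts on UInt32 do not raise overflow exceptions in VB.NET, and unlike BitConverter.ToUInt32 the result does not depend on the machine's byte order.

```vbnet
Option Strict On
Imports System

Module ByteReader
    ' Hypothetical helper (not from the original post): reads four bytes
    ' starting at offset and combines them as a little-endian UInt32,
    ' without allocating a temporary array.
    Function ReadUInt32LE(data As Byte(), offset As Integer) As UInt32
        Return CUInt(data(offset)) Or
               (CUInt(data(offset + 1)) << 8) Or
               (CUInt(data(offset + 2)) << 16) Or
               (CUInt(data(offset + 3)) << 24)
    End Function

    Sub Main()
        Dim data As Byte() = {&H78, &H56, &H34, &H12}
        ' The lowest-addressed byte ends up in the least significant position.
        Console.WriteLine(ReadUInt32LE(data, 0).ToString("X8")) ' prints 12345678
    End Sub
End Module
```

Inside the loop this would be used as k1 = ReadUInt32LE(data, currentIndex), followed by currentIndex += 4.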

Edit: here is a simpler version without the extra variable (but with a "helper constant"):

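Const LOW_32 As UInt32 = &HFFFFFFFFUI
' ... intervening code ...
k1 = (1UL * k1 * c1) And LOW_32
' ... later on ...
h1 = (h1 * 5UL + &HE6546B64UL) And LOW_32

The 1UL forces the calculation inside the parentheses to be performed as ULong (UInt64), which is wide enough to hold the full product of two 32-bit values. A signed Long (Int64) is not: the multiplication by c1 can still overflow it when k1 is large. And LOW_32 then keeps only the low 32 bits, and the overall result is converted to UInt32 automatically (with Option Strict On, wrap the expression in CUInt(...)). Something similar happens on the h1 line.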
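To show how the widen-and-mask idea fits into a whole function, here is a minimal sketch that compiles with Option Strict On and overflow checks enabled. Several things in it are assumptions rather than the asker's code: seed is passed as a parameter (the question does not show where it comes from), the helper names Mul32 and MixK1 are made up for illustration, and the tail and finalization steps, which the question does not include, follow the public MurmurHash3 x86_32 reference as I understand it. It returns UInt32 rather than Int32.

```vbnet
Option Strict On
Imports System
Imports System.Text

Module MurmurSketch
    Private Const LOW_32 As ULong = &HFFFFFFFFUL
    Private Const C1 As ULong = &HCC9E2D51UL
    Private Const C2 As ULong = &H1B873593UL

    Private Function Rotl32(x As UInt32, r As Integer) As UInt32
        Return (x << r) Or (x >> (32 - r))
    End Function

    ' Multiplies modulo 2^32: the product is computed as ULong, which cannot
    ' overflow for two 32-bit operands, then masked back to 32 bits.
    Private Function Mul32(a As UInt32, b As ULong) As UInt32
        Return CUInt((a * b) And LOW_32)
    End Function

    ' Scrambles one 32-bit block before it is mixed into the hash state.
    Private Function MixK1(k1 As UInt32) As UInt32
        k1 = Mul32(k1, C1)
        k1 = Rotl32(k1, 15)
        Return Mul32(k1, C2)
    End Function

    Function MurmurHash3(data As Byte(), seed As UInt32) As UInt32
        Dim h1 As UInt32 = seed
        Dim i As Integer = 0

        ' Body: consume the input one little-endian 32-bit block at a time.
        While data.Length - i >= 4
            Dim k1 As UInt32 = CUInt(data(i)) Or (CUInt(data(i + 1)) << 8) Or
                               (CUInt(data(i + 2)) << 16) Or (CUInt(data(i + 3)) << 24)
            i += 4
            h1 = h1 Xor MixK1(k1)
            h1 = Rotl32(h1, 13)
            h1 = CUInt((h1 * 5UL + &HE6546B64UL) And LOW_32)
        End While

        ' Tail: fold in the remaining 1 to 3 bytes, if any.
        Dim tail As UInt32 = 0
        Dim remaining As Integer = data.Length - i
        If remaining >= 3 Then tail = tail Xor (CUInt(data(i + 2)) << 16)
        If remaining >= 2 Then tail = tail Xor (CUInt(data(i + 1)) << 8)
        If remaining >= 1 Then
            tail = tail Xor CUInt(data(i))
            h1 = h1 Xor MixK1(tail)
        End If

        ' Finalization: mix in the length and avalanche the bits.
        h1 = h1 Xor CUInt(data.Length)
        h1 = h1 Xor (h1 >> 16)
        h1 = Mul32(h1, &H85EBCA6BUL)
        h1 = h1 Xor (h1 >> 13)
        h1 = Mul32(h1, &HC2B2AE35UL)
        h1 = h1 Xor (h1 >> 16)
        Return h1
    End Function

    Sub Main()
        Dim hash As UInt32 = MurmurHash3(Encoding.UTF8.GetBytes("hello"), 0UI)
        Console.WriteLine(hash.ToString("X8"))
    End Sub
End Module
```

If an Int32 result is required, as in the question's signature, note that CInt on a value of 2^31 or more would itself throw with overflow checks on; reinterpreting the bits, for example with BitConverter.ToInt32(BitConverter.GetBytes(h1), 0), avoids that.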

Reference: http://www.undermyhat.org/blog/2009/08/secrets-and-lies-of-type-suffixes-in-c-and-vb-net/ (scroll down to the section on the secrets of constants and type suffixes)

Answer 1 (score: 0)

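Unfortunately, I am not sure whether VB.NET can do the equivalent of C#'s unchecked {} block. You could use a Try/Catch block and perform the calculation manually when an overflow occurs. Be aware that putting an error handler there will slow down the hash calculation.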
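A rough sketch of that Try/Catch idea is below. The helper name MultiplyWrapping is hypothetical, and the fallback uses 64-bit multiplication with a mask rather than manual shifting, which is a substitution on my part. VB.NET has no per-block unchecked, but overflow checking can be turned off for a whole project with the "Remove integer overflow checks" compile option (-removeintchecks); in that case the Catch branch below is simply never reached.

```vbnet
Option Strict On
Imports System

Module OverflowFallback
    ' Hypothetical helper: tries the plain 32-bit multiply first and falls
    ' back to 64-bit arithmetic only when overflow checking throws.
    Function MultiplyWrapping(a As UInt32, b As UInt32) As UInt32
        Try
            Return a * b
        Catch ex As OverflowException
            ' Widen to ULong so the product fits, then keep the low 32 bits.
            Return CUInt((CULng(a) * b) And &HFFFFFFFFUL)
        End Try
    End Function

    Sub Main()
        ' Small product: no exception is involved.
        Console.WriteLine(MultiplyWrapping(3UI, 5UI))                         ' prints 15
        ' Large product: wraps modulo 2^32 on either path.
        Console.WriteLine(MultiplyWrapping(&HFFFFFFFFUI, 2UI).ToString("X8")) ' prints FFFFFFFE
    End Sub
End Module
```

Because most hash inputs overflow on nearly every multiplication, the exception path would be taken constantly, which is why always doing the widened multiplication is likely the cheaper choice.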