Hamming weight / population count in T-SQL

Time: 2011-05-06 20:01:29

Tags: sql sql-server binary population hammingweight

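I'm looking for a fast way to calculate the Hamming weight / population count / "number of 1 bits" of a BINARY(1024) field. MySQL has a BIT_COUNT function that does something like that. I couldn't find a similar function in T-SQL.

Or would you suggest storing the binary data in a field of another type?

If you don't know what I'm talking about, see the Wikipedia article about the Hamming weight.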

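4 answers:

Answer 0: (score: 4)

You can use a helper table with precalculated Hamming weights for small numbers (such as bytes), then split the value into bytes, join them to the helper table, and take the sum of the partial Hamming weights as the Hamming weight of the whole value: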

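-- define Hamming weight helper table
-- each INSERT doubles the table: complementing a value within n bits (2^n - 1 - byte) gives weight n - hw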
DECLARE @hwtally TABLE (byte tinyint, hw int);
INSERT INTO @hwtally (byte, hw) VALUES (0, 0);
INSERT INTO @hwtally (byte, hw) SELECT   1 - byte, 1 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT   3 - byte, 2 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT   7 - byte, 3 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT  15 - byte, 4 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT  31 - byte, 5 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT  63 - byte, 6 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT 127 - byte, 7 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT 255 - byte, 8 - hw FROM @hwtally;

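-- calculate
-- sample input for illustration; in practice @value holds the binary value to examine
DECLARE @value varbinary(1024);
SET @value = 0x0F01;

WITH split AS (
  -- one row per byte of @value; spt_values rows of type 'P' serve as a numbers table
  SELECT SUBSTRING(@value, number, 1) AS byte
  FROM master.dbo.spt_values
  WHERE type = 'P' AND number BETWEEN 1 AND DATALENGTH(@value)  -- DATALENGTH returns the length in bytes
)
SELECT
  Value = @value,
  HammingWeight = SUM(t.hw)
FROM split s
  INNER JOIN @hwtally t ON s.byte = t.byte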

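To apply the same lookup to a table column rather than a single variable, something along these lines should work. This is only a sketch: the @items table, its id and bits columns, and the sample values are made up for illustration, and the helper table is rebuilt here so that the batch runs on its own.

```sql
-- Hypothetical sample data; table and column names are assumptions.
DECLARE @items TABLE (id int PRIMARY KEY, bits varbinary(1024));
INSERT INTO @items (id, bits) VALUES (1, 0x00);
INSERT INTO @items (id, bits) VALUES (2, 0xFF01);
INSERT INTO @items (id, bits) VALUES (3, 0x0F0F0F);

-- Per-byte lookup table, built with the same complement trick as above.
DECLARE @hwtally TABLE (byte tinyint, hw int);
INSERT INTO @hwtally (byte, hw) VALUES (0, 0);
INSERT INTO @hwtally (byte, hw) SELECT   1 - byte, 1 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT   3 - byte, 2 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT   7 - byte, 3 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT  15 - byte, 4 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT  31 - byte, 5 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT  63 - byte, 6 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT 127 - byte, 7 - hw FROM @hwtally;
INSERT INTO @hwtally (byte, hw) SELECT 255 - byte, 8 - hw FROM @hwtally;

-- One row per (item, byte position), then sum the per-byte weights per item.
SELECT i.id, HammingWeight = SUM(t.hw)
FROM @items AS i
  INNER JOIN master.dbo.spt_values AS n
    ON n.type = 'P' AND n.number BETWEEN 1 AND DATALENGTH(i.bits)
  INNER JOIN @hwtally AS t
    ON t.byte = CAST(SUBSTRING(i.bits, n.number, 1) AS tinyint)
GROUP BY i.id
ORDER BY i.id;
-- expected weights: id 1 -> 0, id 2 -> 9, id 3 -> 12
```

Rows whose value is NULL or empty produce no byte rows and therefore drop out of the result; a LEFT JOIN from the items table would be needed to keep them. For repeated use, a permanent helper table keyed on byte would probably be a better fit than a table variable.
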
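Answer 1: (score: 1)

When you are working with small values (at most 16 bits), the most efficient way to do this in SQL Server is to use a table with all results precomputed and join against it.

Doing this let me speed up a query that computes the Hamming weight of 4-bit values over 17,000,000 rows from 30 seconds to 0 seconds.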

-- lookup table: x = 4-bit value, Fx = number of set bits in x
WITH HammingWeightHelper AS (
    SELECT x, Fx
    FROM (VALUES (0,0),(1,1),(2,1),(3,2),
                 (4,1),(5,2),(6,2),(7,3),
                 (8,1),(9,2),(10,2),(11,3),
                 (12,2),(13,3),(14,3),(15,4)) AS HammingWeight(x, Fx)
)
SELECT HammingWeightHelper.Fx AS HammingWeight, SomeTable.Value AS bitField
FROM   SomeTable INNER JOIN
       HammingWeightHelper ON HammingWeightHelper.x = SomeTable.Value

Of course this is an ugly solution, and it may not be suitable for long bit fields.
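
One way to stretch the 4-bit table to wider values is to split each value into 4-bit pieces and add up the per-piece weights. The following is a sketch for non-negative 16-bit integers, assuming SQL Server 2008 or later for the VALUES constructor; the Samples CTE and its values are hypothetical stand-ins for a real table.

```sql
WITH HammingWeightHelper AS (
    SELECT x, Fx
    FROM (VALUES (0,0),(1,1),(2,1),(3,2),
                 (4,1),(5,2),(6,2),(7,3),
                 (8,1),(9,2),(10,2),(11,3),
                 (12,2),(13,3),(14,3),(15,4)) AS HammingWeight(x, Fx)
),
Samples AS (
    -- hypothetical 16-bit inputs: 0x0000, 0x00FF, 0x1234
    SELECT Value FROM (VALUES (0), (255), (4660)) AS v(Value)
)
SELECT s.Value,
       HammingWeight = n0.Fx + n1.Fx + n2.Fx + n3.Fx
FROM Samples AS s
  INNER JOIN HammingWeightHelper AS n0 ON n0.x = (s.Value & 15)           -- bits 0-3
  INNER JOIN HammingWeightHelper AS n1 ON n1.x = ((s.Value / 16) & 15)    -- bits 4-7
  INNER JOIN HammingWeightHelper AS n2 ON n2.x = ((s.Value / 256) & 15)   -- bits 8-11
  INNER JOIN HammingWeightHelper AS n3 ON n3.x = ((s.Value / 4096) & 15)  -- bits 12-15
ORDER BY s.Value;
-- expected weights: 0 -> 0, 255 -> 8, 4660 -> 5
```

Each extra 4 bits costs one more join, so this stays practical for a handful of pieces but not for something like BINARY(1024).
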

Answer 2: (score: 0)

I didn't find anything specifically about Hamming weight, but here is a function for the Hamming distance:

-- counts the positions at which two strings differ, character by character
create function HamDist(@value1 char(8000), @value2 char(8000))
returns int
as
begin
    declare @distance int
    declare @i int
    declare @len int

    -- compare up to the length of the longer value (len ignores trailing spaces)
    select @distance = 0,
           @i = 1,
           @len = case when len(@value1) > len(@value2)
                       then len(@value1)
                       else len(@value2) end

    if (@value1 is null) or (@value2 is null)
        return null

    -- add 1 for every position where the characters differ
    while (@i <= @len)
        select @distance = @distance +
                           case when substring(@value1,@i,1) != substring(@value2,@i,1)
                                then 1
                                else 0 end,
               @i = @i + 1

    return @distance
end

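This computes the Hamming distance between two values. The Hamming weight of a single value would be the Hamming distance between that value and an all-zero array. Note that the function compares characters, not bits, so this only gives a bit count when the values are strings made up of '0' and '1' characters.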
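
As a usage sketch, the weight of a bit string could be obtained like this. It assumes the HamDist function above has been created in the dbo schema and that the value is stored as a string of '0' and '1' characters; the sample value is hypothetical.

```sql
-- Assumes dbo.HamDist exists (created from the definition above).
DECLARE @bits varchar(8000);
SET @bits = '10110';  -- hypothetical bit string containing three '1' characters

-- The all-zero string must be as long as @bits: HamDist pads both arguments
-- to char(8000), so a shorter zero string would compare spaces against digits.
SELECT HammingWeight = dbo.HamDist(@bits, REPLICATE('0', LEN(@bits)));  -- returns 3
```

The function loops once per character, so this is unlikely to be fast on large tables.
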

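Answer 3: (score: 0)

I couldn't find a good way to do it. In the end I computed the Hamming weight in Java and periodically update the bit count in the database.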