SQL Server HAVING clause not filtering floats as expected

Asked: 2017-07-24 18:32:26

Tags: sql sql-server

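I have a fairly simple subquery here. Its purpose is to identify employees who have logged fewer than 40 hours, across several weeks' worth of datasets. But every once in a while, an employee showing 40 hours sneaks into the results. The underlying data uses high-precision floats, and I did my best to round each individual entry to two decimal places first, then sum all the entries grouped by employee. I even tried < 39.999 and I still get an employee with 40 hours. In fact, if you take the data as is, this employee actually has 40.0000002980232, which is more than 40, so I suspect the problem has nothing to do with rounding error and everything to do with syntax. Can anyone figure out why this statement pulls in rows that violate the HAVING clause?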

    create table #thisweek ([hours] float, time_user varchar(100))

insert into #thisweek values
(1.58000004291534,  'john.doe'),
(4.32000017166138,  'john.doe'),
(0.620000004768372, 'john.doe'),
(1, 'john.doe'),
(0.680000007152557, 'john.doe'),
(2, 'john.doe'),
(2, 'john.doe'),
(3, 'john.doe'),
(0.790000021457672, 'john.doe'),
(3, 'john.doe'),
(2, 'john.doe'),
(1, 'john.doe'),
(3.32999992370605,  'john.doe'),
(2, 'john.doe'),
(4.42000007629395,  'john.doe'),
(1.33000004291534,  'john.doe'),
(0.579999983310699, 'john.doe'),
(3.29999995231628,  'john.doe'),
(2.1800000667572,   'john.doe'),
(0.620000004768372, 'john.doe'),
(0.25,  'john.doe')

select sum(hours) from #thisweek group by time_user
/* 40.0000002980232 */
/* Test: Nothing should come up since employee John Doe has 40 hours */
select sum(Round(hours,2)) hours /*<--- this is the same value as below*/, time_user
from #thisweek
group by time_user
having sum(Round(hours,2)) < 40  /* how is 40 < 40? */

6 answers:

Answer 0 (score: 2)

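As @Gordon Linoff stated, float seems like the wrong type for hours. But if you must use float, you need to avoid the Round function, because when it is given a float its return value is also a float. An equivalent way to round a value to 2 decimal places is cast( hours * 100 + .5 as int) / 100.00, so your query can be: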

select sum(cast( hours * 100 + .5  as int) / 100.00) as hours, time_user
from #thisweek
group by time_user
having sum(hours) < 40
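
To see why the two rounding approaches behave differently, it can help to inspect the data type each expression produces. The following is a minimal sketch against the question's #thisweek table; it assumes SQL_VARIANT_PROPERTY with the 'BaseType' property is an acceptable way to report an expression's type, and the column aliases are made up for illustration.

```sql
-- Sketch: compare the result types of the two rounding approaches.
-- round_type and cast_type are illustrative aliases, not from the original post.
SELECT TOP (1)
    -- ROUND on a float input yields a float, so binary rounding error remains.
    SQL_VARIANT_PROPERTY(ROUND(hours, 2), 'BaseType') AS round_type,
    -- Truncating to int and dividing by a decimal literal yields an exact decimal type.
    SQL_VARIANT_PROPERTY(CAST(hours * 100 + .5 AS int) / 100.00, 'BaseType') AS cast_type
FROM #thisweek;
```

Note that the cast-to-int trick truncates toward zero, so it only rounds half up for non-negative values, which is fine for hours worked.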

Answer 1 (score: 2)

The tools are lying to you when they display '40'. CONVERT with style 2 returns 16 digits:

SELECT CONVERT(VARCHAR(100), SUM(ROUND(hours, 2)), 2) FROM #thisWeek
  

3.999999999999999e+001

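When you round before summing, you accumulate the rounding errors instead of the original numbers. Some values round up and some round down. In this case the round-down errors outweigh the round-up errors by enough to push the sum of the rounded numbers below 40, but that is hard to see because the final result is itself rounded for display.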

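I don't believe the problem is any more subtle than rounding, because others have reported that the query behaves correctly if the rounded values are cast to decimal before summing. In any case, because the display tool rounds the result to an even 40, you can't see that the HAVING clause is actually working.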
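
The cast-to-decimal variant mentioned above might look like the sketch below. It assumes decimal(10, 2) is wide enough for the data; that precision is an illustrative choice, not something specified in the thread.

```sql
-- Sketch: convert each rounded value to an exact decimal before summing,
-- so the aggregate no longer accumulates binary floating-point error.
SELECT SUM(CAST(ROUND(hours, 2) AS decimal(10, 2))) AS hours, time_user
FROM #thisweek
GROUP BY time_user
HAVING SUM(CAST(ROUND(hours, 2) AS decimal(10, 2))) < 40;
```

With the sample rows from the question, the two-decimal values add up to exactly 40.00, so this query should return no rows for john.doe.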

Answer 2 (score: 1)

I think your query will work if you compare against a decimal value:

 select sum(Round(hours,2)) hours, time_user,  MAX(cdate) maxdate

    from

    #thisweek2

    group by time_user
    having  sum(Round(hours,2)) < 40.0

Answer 3 (score: 1)

This looks like a rounding error. When I make this modification to the first query in your example:

select convert(int,sum(Round(hours,2))) from #thisweek group by time_user

I get 39, and 39 < 40. So SQL Server is handling your HAVING clause correctly.
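
The result of 39 follows from the fact that converting a float to int truncates rather than rounds. A small sketch with a literal value chosen for illustration (it is not taken from the question's data):

```sql
-- Sketch: a float just below 40 truncates to 39 when converted to int.
SELECT CONVERT(int, CAST(39.99999999999999 AS float)) AS truncated;          -- 39

-- Rounding to a whole number first gives 40 instead.
SELECT CONVERT(int, ROUND(CAST(39.99999999999999 AS float), 0)) AS rounded;  -- 40
```

So the int conversion is useful as a diagnostic that the sum is slightly below 40, but not as a way to compare totals.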

Answer 4 (score: 1)

Fix your problem at the source. Instead of:

create table #thisweek (
    [hours] float,
    time_user varchar(100)
)

use an appropriate data type:

create table #thisweek (
    [hours] decimal(10, 4),
    time_user varchar(100)
)

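Don't use float. Use fixed-point arithmetic. Most of those decimal places don't matter: do you really measure working time to fractions of a second? Even 4 decimal places is probably overkill, since 0.0001 hours is only about a third of a second.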

The rest of the code would then work, but is more simply written as:

select sum(hours) as hours, time_user
from #thisweek
group by time_user
having sum(hours) < 40;
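
If the float data already exists, one possible way to apply this is to copy it into a decimal-typed table and query that instead. This is only a sketch: #thisweek_fixed is a hypothetical table name, and it assumes that rounding each entry to 4 decimal places on conversion is acceptable.

```sql
-- Sketch: move existing float data into a fixed-point column.
-- #thisweek_fixed is a hypothetical name used only for this example.
CREATE TABLE #thisweek_fixed (
    [hours] decimal(10, 4),
    time_user varchar(100)
);

-- Converting float to decimal rounds each value to the target scale.
INSERT INTO #thisweek_fixed ([hours], time_user)
SELECT CAST([hours] AS decimal(10, 4)), time_user
FROM #thisweek;

-- Decimal sums are exact, so the comparison against 40 behaves as expected.
SELECT SUM([hours]) AS hours, time_user
FROM #thisweek_fixed
GROUP BY time_user
HAVING SUM([hours]) < 40;
```

For the sample rows in the question, every entry converts to its two-decimal value and the total is exactly 40.0000, so the final query should return nothing.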

Answer 5 (score: 1)

What you need is a method for troubleshooting this.

First, you know there is at least one bad record, so add a where condition for just that user and remove the having condition to isolate that data.

Then build the query up step by step until you see where it goes off the rails. Something like:

select *
from
#thisweek2
where time_user = 1234

select hours , time_user
from
#thisweek2
where time_user = 1234

select hours, time_user,  MAX(cdate) maxdate
from
#thisweek2
where time_user = 1234
group by hours, time_user

select Round(hours,2) hours, time_user,  MAX(cdate) maxdate
from
#thisweek2
group by  Round(hours,2),time_user
having convert(int,sum(Round(hours,2))) < 40

select sum(Round(hours,2)) hours, time_user,  MAX(cdate) maxdate
from
#thisweek2
where time_user = 1234
group by time_user

select sum(Round(hours,2)) hours, time_user,  MAX(cdate) maxdate
from
#thisweek2
group by time_user
having sum(hours) < 40


select sum(Round(hours,2)) hours, time_user,  MAX(cdate) maxdate
from
#thisweek2
group by time_user
having sum(Round(hours,2)) < 40


select sum(Round(hours,2)) hours, time_user,  MAX(cdate) maxdate
from
#thisweek2
group by time_user
having convert(int,sum(Round(hours,2))) < 40

For the earlier queries that don't use sum, add up the results by hand.