Summing all records after the last no-good one

Time: 2012-03-26 16:00:15

Tags: sql ms-access query-optimization jet

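I work in a manufacturing environment, and we have an Access database set up to record no-good parts. The two tables we are dealing with at the moment are Occurrences and Sort. Occurrences records the details of each problem, and Sort records the number of parts sorted as good or no-good. I have a query that I am trying to optimize; it sums the number of parts sorted since the last defect was found. Here is the current (very messy) query: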

SELECT Sort.[OccurrenceID], Sum(Sort.Sorted) AS SumOfSorted
FROM Sort
WHERE 
    (((Sort.SortDate)>(select top 1 dupe.sortdate from Sort as dupe where (((dupe.[OccurrenceID])=(Sort.[OccurrenceID])) and (((dupe.repaired)<>0) or ((dupe.scrapped)<>0))) order by dupe.sortdate desc, dupe.sortshift desc, dupe.id desc)))
OR (
     ((Sort.SortDate)>=(select top 1 dupe.sortdate from Sort as dupe where (((dupe.[OccurrenceID])=(Sort.[OccurrenceID])) and (((dupe.repaired)<>0) or ((dupe.scrapped)<>0))) order by dupe.sortdate desc, dupe.sortshift desc, dupe.id desc))
 AND ((Sort.SortShift)>(select top 1 dupe.sortshift from Sort as dupe where (((dupe.[OccurrenceID])=(Sort.[OccurrenceID])) and (((dupe.repaired)<>0) or ((dupe.scrapped)<>0))) order by dupe.sortdate desc, dupe.sortshift desc, dupe.id desc)))
OR (
     ((Sort.SortDate)=(select top 1 dupe.sortdate from Sort as dupe where (((dupe.[OccurrenceID])=(Sort.[OccurrenceID])) and (((dupe.repaired)<>0) or ((dupe.scrapped)<>0))) order by dupe.sortdate desc, dupe.sortshift desc, dupe.id desc))
 AND ((Sort.SortShift)=(select top 1 dupe.sortshift from Sort as dupe where (((dupe.[OccurrenceID])=(Sort.[OccurrenceID])) and (((dupe.repaired)<>0) or ((dupe.scrapped)<>0))) order by dupe.sortdate desc, dupe.sortshift desc, dupe.id desc))
 AND ((Sort.ID)>(select top 1 dupe.id from Sort as dupe where (((dupe.[OccurrenceID])=(Sort.[OccurrenceID])) and (((dupe.repaired)<>0) or ((dupe.scrapped)<>0))) order by dupe.sortdate desc, dupe.sortshift desc, dupe.id desc)))
GROUP BY Sort.[OccurrenceID];

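This works, but it takes forever to run. I tried to refactor the 'dupe' subqueries into a stacked query of their own, and ended up with the following query, which I call SortRejects: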

SELECT Sort.[OccurrenceID], Sort.SortDate, Sort.SortShift, Sort.ID, Sort.Scrapped, Sort.Repaired
FROM Sort
WHERE (((Sort.Scrapped)<>0)) OR (((Sort.Repaired)<>0))
ORDER BY Sort.SortDate DESC , Sort.SortShift DESC , Sort.ID DESC;

The new final query:

SELECT Sort.[OccurrenceID], Sum(Sort.Sorted) AS SumOfSorted
FROM Sort
WHERE 
   (((Sort.sortdate)>(select top 1 SortRejects.sortdate from SortRejects where ((SortRejects.[OccurrenceID])=(Sort.[OccurrenceID])))))
OR (
     ((Sort.sortdate)>=(select top 1 SortRejects.sortdate from SortRejects where ((SortRejects.[OccurrenceID])=(Sort.[OccurrenceID]))))
 AND ((Sort.sortshift)>(select top 1 SortRejects.sortshift from SortRejects where ((SortRejects.[OccurrenceID])=(Sort.[OccurrenceID])))))
OR (
     ((Sort.sortdate)=(select top 1 SortRejects.sortdate from SortRejects where ((SortRejects.[OccurrenceID])=(Sort.[OccurrenceID]))))
 AND ((Sort.sortshift)=(select top 1 SortRejects.sortshift from SortRejects where ((SortRejects.[OccurrenceID])=(Sort.[OccurrenceID]))))
 AND ((Sort.ID)>(select top 1 SortRejects.id from SortRejects where ((SortRejects.[OccurrenceID])=(Sort.[OccurrenceID])))))
GROUP BY Sort.[OccurrenceID];

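This is faster, but it does not return the same results. Am I missing something, or do stacked queries work differently from subqueries in this case?

2 Answers:

Answer 0: (score: 0)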

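Well... I don't know much about Access syntax and usage, so this may not be completely applicable.

The stacked query is probably generating a temporary table that is already sorted. Tables in SQL (I'm less sure about Access) have no inherent order. So when the outer query references rows of the stacked query, the access plan changes depending on the columns referenced, because the engine tries to optimize the path taken, and TOP 1 is no longer guaranteed to pick the same row.

In any case, I'm not really happy with either of these queries; something about them just bothers me. It is probably the constant repetition of SELECT TOP 1 - normally I would use a couple of CTEs for this effect, but I don't think they are available in Access. So here are some changes:

First, you may want to switch from using SortDate, SortShift and Sort.Id to just a simple timestamp (you are trying to find everything since a particular problem occurred, right?).

Also, you are concerning yourself too much with the values of the 'previous' row, when what you should care about is the values of the current (desired) row (I fall into this trap occasionally too). You don't care about the 'top' row itself - our results simply need to be greater than every row that has a defect.

So, if Sort.id is constantly increasing, and values are NOT reused, you should be able to get by with just that single value, like so (possibly in Access syntax):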

SELECT a.[OccurrenceId], SUM(a.sorted) as totalSorted
FROM Sort as a
LEFT JOIN Sort as b
ON b.[OccurrenceId] = a.[OccurrenceId]
AND (b.scrapped > 0  -- This is a count, right?
     OR b.repaired > 0)  -- see above
AND b.id > a.id
WHERE b.id IS NULL
GROUP BY a.[OccurrenceId]

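For a given occurrence, this basically gets all rows for which there is no 'greater' row (by Sort.id) with a positive scrapped or repaired count. Note that, as the query is written, if the 'greatest' row itself has some scrapped or repaired parts, that row is still returned, with no indication that it was a 'failed' run.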
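To make that last point concrete, here is a minimal sketch that runs the anti-join on a few made-up rows. It uses Python's sqlite3 module as a stand-in for Access (an assumption made purely so the example can be run anywhere; Jet SQL syntax differs in places), and the sample data is hypothetical. Comparing with `>` keeps the last reject row in the sum, while `>=` drops it.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Access here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sort (id INTEGER PRIMARY KEY, OccurrenceID INTEGER, "
    "sorted INTEGER, scrapped INTEGER, repaired INTEGER)"
)
conn.executemany(
    "INSERT INTO Sort VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 100, 0, 0),
        (2, 1, 50, 2, 0),  # last reject for occurrence 1
        (3, 1, 30, 0, 0),
        (4, 1, 20, 0, 0),
        (5, 2, 10, 0, 1),  # occurrence 2: the last row is itself a reject
    ],
)

# Anti-join: keep rows of "a" for which no reject row "b" compares as later.
ANTI_JOIN = """
SELECT a.OccurrenceID, SUM(a.sorted) AS totalSorted
FROM Sort AS a
LEFT JOIN Sort AS b
  ON b.OccurrenceID = a.OccurrenceID
 AND (b.scrapped > 0 OR b.repaired > 0)
 AND b.id {op} a.id
WHERE b.id IS NULL
GROUP BY a.OccurrenceID
ORDER BY a.OccurrenceID
"""


def totals(op):
    """Run the anti-join with the given comparison between b.id and a.id."""
    return conn.execute(ANTI_JOIN.format(op=op)).fetchall()


strict = totals(">")      # the last reject row itself is still summed
inclusive = totals(">=")  # the last reject row is excluded as well
print(strict)
print(inclusive)
```

With `>=`, an occurrence whose most recent row is a reject has nothing left to sum and disappears from the result, which may or may not be what the report needs.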

If Sort.id is reused (perhaps per shift, or however it is assigned), it gets a bit more complicated:

SELECT a.OccurrenceId, SUM(a.sorted) as totalSorted
FROM Sort as a
LEFT JOIN Sort as b
ON b.[OccurrenceId] = a.[OccurrenceId]
AND (b.scrapped > 0
     OR b.repaired > 0)
AND (b.sortDate > a.sortDate
     OR (b.sortDate = a.sortDate
         AND b.sortShift > a.sortShift)
     OR (b.sortDate = a.sortDate
         AND b.sortShift = a.sortShift
         AND b.id > a.id))
WHERE b.id IS NULL
GROUP BY a.[OccurrenceId]

I make no guarantee that either of these versions will be faster, although they may well be. They are, however, less convoluted.
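As a rough illustration of why the longer join condition matters when ids are reused, the sketch below runs both comparisons on the same hypothetical rows. Python's sqlite3 module stands in for Access here (an assumption, chosen only so the example is runnable), and dates are stored as ISO text so that they compare in chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sort (OccurrenceID INTEGER, sortDate TEXT, sortShift INTEGER, "
    "id INTEGER, sorted INTEGER, scrapped INTEGER, repaired INTEGER)"
)
conn.executemany(
    "INSERT INTO Sort VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2012-03-01", 1, 1, 40, 0, 0),
        (1, "2012-03-01", 2, 1, 30, 1, 0),  # the only reject; id 1 is reused
        (1, "2012-03-01", 2, 2, 25, 0, 0),
        (1, "2012-03-02", 1, 1, 15, 0, 0),
    ],
)

QUERY = """
SELECT a.OccurrenceID, SUM(a.sorted) AS totalSorted
FROM Sort AS a
LEFT JOIN Sort AS b
  ON b.OccurrenceID = a.OccurrenceID
 AND (b.scrapped > 0 OR b.repaired > 0)
 AND ({later})
WHERE b.id IS NULL
GROUP BY a.OccurrenceID
"""

# "b is later than a" judged by id alone, or by (date, shift, id).
ID_ONLY = "b.id > a.id"
FULL_KEY = """
    b.sortDate > a.sortDate
 OR (b.sortDate = a.sortDate AND b.sortShift > a.sortShift)
 OR (b.sortDate = a.sortDate AND b.sortShift = a.sortShift AND b.id > a.id)
"""

by_id = conn.execute(QUERY.format(later=ID_ONLY)).fetchall()
by_key = conn.execute(QUERY.format(later=FULL_KEY)).fetchall()
print(by_id)   # the first-shift row is wrongly kept, because its id is not smaller
print(by_key)  # the first-shift row is excluded, as intended
```

Judging by id alone, the row from the first shift shares its id with the reject and so is never seen as earlier; the full date, shift, id comparison orders it correctly.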

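Answer 1: (score: 0)

The original query is most likely slow because of the sub-SELECT queries in the WHERE clause, which may be executed for every record in the Sort table.

If I were you, I would use VBA to reduce each SELECT TOP 1 to the single value it is checking, and then re-inject those values into the main query.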

Public Function GetMessyQuerySQL() As String

    ' First, extract our where-clause data: SortDate, SortShift, and ID.
    ' The original filter "(OccurrenceID = OccurrenceID)" was always true, so it
    ' is dropped: this finds the latest reject across ALL occurrences.
    Dim sql As String
    sql = sql & "SELECT TOP 1 sortdate, sortshift, id "
    sql = sql & "FROM   Sort "
    sql = sql & "WHERE  (Sort.repaired <> 0) "
    sql = sql & "        OR (Sort.scrapped <> 0) "
    sql = sql & "ORDER  BY sortdate DESC, "
    sql = sql & "          sortshift DESC, "
    sql = sql & "          id DESC;"

    Dim rs As DAO.Recordset
    Dim SortDate As String, SortShift As String, SortID As String

    ' We assume that query always returns a valid row!
    Set rs = CurrentDb().OpenRecordset(sql, dbOpenForwardOnly)
    ' Convert dates to literal dates in the format #03/29/2012#
    SortDate = Format(rs(0), "\#mm\/dd\/yyyy\#")
    ' Assumes SortShift is a Date field; if it is numeric, use CStr(rs(1)) instead
    SortShift = Format(rs(1), "\#mm\/dd\/yyyy\#")
    SortID = CStr(rs(2))
    rs.Close
    Set rs = Nothing

    ' Now, modify the main query with our data from the first
    sql = vbNullString
    sql = sql & "SELECT OccurrenceID, "
    sql = sql & "       SUM(Sorted) AS SumOfSorted "
    sql = sql & "FROM   Sort "
    sql = sql & "WHERE  (SortDate > %SortDate) "
    sql = sql & "        OR ((SortDate >= %SortDate) "
    sql = sql & "            AND (SortShift > %SortShift)) "
    sql = sql & "        OR ((SortDate = %SortDate) "
    sql = sql & "            AND (SortShift = %SortShift) "
    sql = sql & "            AND (ID > %SortID)) "
    sql = sql & "GROUP  BY OccurrenceID;  "

    ' Replace the %xxx placeholders by their proper values
    sql = Replace(sql, "%SortDate", SortDate)
    sql = Replace(sql, "%SortShift", SortShift)
    sql = Replace(sql, "%SortID", SortID)

    ' Just return the constructed SQL
    GetMessyQuerySQL = sql
End Function

So calling this function will give you SQL code like this:

SELECT OccurrenceID,
       SUM(Sorted) AS SumOfSorted
FROM   Sort
WHERE  (SortDate > #02/17/2012#)
        OR ((SortDate >= #02/17/2012#)
             AND (SortShift > #03/29/2012#))
        OR ((SortDate = #02/17/2012#)
             AND (SortShift = #03/29/2012#)
             AND (ID > 25))
GROUP  BY OccurrenceID; 

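Anyway, this is just the basic principle, and you can take it further. If your query is bound to a form in datasheet view, you could do something like this in the form's Load event: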

Private Sub Form_Load()
     ' Use the SQL built by GetMessyQuerySQL as the form's record source
     Me.RecordSource = GetMessyQuerySQL()
End Sub
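The two-step idea (look up the cutoff once, then feed it to the main query) can also be sketched with bound parameters instead of string replacement. The following is only an illustration under stated assumptions: it uses Python's sqlite3 module rather than Access and VBA, the function name `sorted_since_last_reject` and the sample rows are hypothetical, the lookup is done per occurrence, and an occurrence with no reject at all is assumed to count every row.

```python
import sqlite3


def sorted_since_last_reject(conn, occurrence_id):
    """Sum Sort.sorted for rows after the latest reject of one occurrence."""
    # Step 1: fetch the (date, shift, id) of the most recent reject once.
    cutoff = conn.execute(
        "SELECT sortDate, sortShift, id FROM Sort "
        "WHERE OccurrenceID = ? AND (repaired <> 0 OR scrapped <> 0) "
        "ORDER BY sortDate DESC, sortShift DESC, id DESC LIMIT 1",
        (occurrence_id,),
    ).fetchone()
    if cutoff is None:
        # Assumption: with no reject recorded, every row counts.
        return conn.execute(
            "SELECT COALESCE(SUM(sorted), 0) FROM Sort WHERE OccurrenceID = ?",
            (occurrence_id,),
        ).fetchone()[0]
    # Step 2: feed those three values to the main query as parameters.
    d, s, i = cutoff
    return conn.execute(
        "SELECT COALESCE(SUM(sorted), 0) FROM Sort "
        "WHERE OccurrenceID = ? AND (sortDate > ? "
        "  OR (sortDate = ? AND sortShift > ?) "
        "  OR (sortDate = ? AND sortShift = ? AND id > ?))",
        (occurrence_id, d, d, s, d, s, i),
    ).fetchone()[0]


# Hypothetical sample data; dates are ISO text so they sort chronologically.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sort (OccurrenceID INTEGER, sortDate TEXT, sortShift INTEGER, "
    "id INTEGER, sorted INTEGER, scrapped INTEGER, repaired INTEGER)"
)
conn.executemany(
    "INSERT INTO Sort VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2012-03-01", 1, 1, 40, 0, 0),
        (1, "2012-03-01", 2, 1, 30, 1, 0),  # latest reject for occurrence 1
        (1, "2012-03-01", 2, 2, 25, 0, 0),
        (1, "2012-03-02", 1, 1, 15, 0, 0),
        (2, "2012-03-02", 1, 1, 10, 0, 0),  # occurrence 2 has no rejects
    ],
)

after_reject = sorted_since_last_reject(conn, 1)
no_reject = sorted_since_last_reject(conn, 2)
print(after_reject, no_reject)
```

Binding values as parameters avoids having to format date literals by hand, at the cost of one extra lookup per occurrence.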