MS Access: left outer join with an inequality operator

Date: 2018-04-26 23:04:02

Tags: ms-access

I have a table in Access 2010:

TMP [CUST,ITEM,START_PD]

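I want to get the END_PD for each CUST / ITEM.

END_PD is defined as the period just before the next higher START_PD for the same CUST / ITEM.

So I left join the table to itself, using an inequality operator on START_PD, as shown below.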

SELECT s.CUST, s.ITEM, s.START_PD, Min(e.START_PD-1) AS END_PD
FROM TMP AS s 
LEFT JOIN TMP AS e ON s.CUST=e.CUST AND s.ITEM=e.ITEM AND e.START_PD>s.START_PD
GROUP BY s.CUST, s.ITEM, s.START_PD
ORDER BY s.CUST, s.ITEM, s.START_PD

The base table has 46,556 rows. I expected the query result to have the same number of rows, but the query returns only 14,967.

Even when I simply return every record from the left join, I get far fewer rows than the base table has. See below:

SELECT s.*,e.*
FROM TMP AS s 
LEFT JOIN TMP AS e ON s.ITEM = e.ITEM AND s.CUST = e.CUST AND e.START_PD>s.START_PD

The query above returns only 19,014 records, fewer than the base table.

This is a major project and I would appreciate any help. So far it looks like an Access bug. Are there any workarounds?

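Edit: I tried testing on a small subset of the data by adding WHERE s.CUST='WALMART' AND s.ITEM='0001H'. The query fails there too: it drops the last record, the one with no greater START_PD.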

CUST    ITEM    START_PD    END_PD
WALMART 0001HAC 20694   20696
WALMART 0001HAC 20697   20704
WALMART 0001HAC 20705   20706

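Oddly, if I select the same data sample (WALMART / 0001H) into a separate table and run the EXACT SAME query on that smaller table (changing only the table name), it works correctly, as shown below. That is why I tend to think this is a bug.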

CUST    ITEM    START_PD    END_PD
WALMART 0001HAC 20694   20696
WALMART 0001HAC 20697   20704
WALMART 0001HAC 20705   20706
WALMART 0001HAC 20707
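
For comparison, the four-row result above is what standard left-join semantics should produce. The following is a minimal sketch that runs the question's query against the same four sample rows in SQLite, through Python's standard-library sqlite3 module. SQLite is a different engine from Access and is used here only as a reference point; the in-memory table is an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite table standing in for the Access table TMP.
# Assumption: SQLite is only a reference for standard LEFT JOIN semantics, not Access.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TMP (CUST TEXT, ITEM TEXT, START_PD INTEGER, "
    "PRIMARY KEY (CUST, ITEM, START_PD))"
)
# The four sample rows shown in the question.
conn.executemany(
    "INSERT INTO TMP VALUES (?, ?, ?)",
    [("WALMART", "0001HAC", pd) for pd in (20694, 20697, 20705, 20707)],
)

# Same query as in the question: END_PD is one less than the next higher START_PD.
rows = conn.execute(
    """
    SELECT s.CUST, s.ITEM, s.START_PD, MIN(e.START_PD - 1) AS END_PD
    FROM TMP AS s
    LEFT JOIN TMP AS e
      ON s.CUST = e.CUST AND s.ITEM = e.ITEM AND e.START_PD > s.START_PD
    GROUP BY s.CUST, s.ITEM, s.START_PD
    ORDER BY s.CUST, s.ITEM, s.START_PD
    """
).fetchall()

for row in rows:
    print(row)
```

The last row (START_PD 20707) comes back with END_PD as None, because no later period exists for it; a left join keeps that row rather than dropping it.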

2 answers:

Answer 0: (score: 0)

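Any "atypical" join in Access (anything beyond table1.field = table2.field, such as a constant, a calculation, or a comparison operator other than equality) should have its ON clause enclosed in parentheses: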

SELECT s.*,e.*
FROM TMP AS s 
LEFT JOIN TMP AS e ON (s.ITEM = e.ITEM AND s.CUST = e.CUST AND e.START_PD>s.START_PD)

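Usually, though, joins of this kind raise a "Join expression not supported" error rather than returning incorrect results. I don't know why this one doesn't.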

Answer 1: (score: 0)

To answer my own question, I would like to:

  1. Prove that this is a bug
  2. Provide a workaround

    Proof that this is a bug

        Sub Test()
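            Dim db As DAO.Database, rst As DAO.Recordset
            'Explicit declarations so the procedure also compiles under Option Explicit.
            Dim strSql As String, strCust As String, strItem As String
            Dim custNo As Long, itemNo As Long, pdNo As Long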
            Set db = CurrentDb
    
            On Error Resume Next
            db.QueryDefs.Delete "TMP_EXTENDED"
            db.QueryDefs.Delete "RAW_LEFT_JOIN"
            db.TableDefs.Delete "TMP"
            On Error GoTo 0
    
            'Table definition.
            strSql = _
            "CREATE TABLE TMP ( " & vbCrLf & _
            "  CUST VARCHAR(10), " & vbCrLf & _
            "  ITEM VARCHAR(10), " & vbCrLf & _
            "  START_PD LONG, " & vbCrLf & _
            "  PRIMARY KEY (CUST,ITEM,START_PD) " & vbCrLf & _
            ");"
            db.Execute strSql, dbFailOnError
    
            'Populate with data.
            Set rst = db.OpenRecordset("TMP")
            For custNo = 1 To 25      'change to affect final row count
                For itemNo = 1 To 100 'change to affect final row count
                    For pdNo = 1 To 3 'change to affect final row count
                        strCust = "CUST" & custNo
                        strItem = "ITEM" & itemNo
                        rst.AddNew
                        rst("CUST") = strCust
                        rst("ITEM") = strItem
                        rst("START_PD") = pdNo
                        rst.Update
                        If rst.RecordCount Mod 1000 = 0 Then Debug.Print rst.RecordCount 'just to monitor.
                    Next
                Next
            Next
            Debug.Print "TMP Table Row Count is: " & DCount("*", "TMP")
    
            'Test query to find end period for each CUST/ITEM/START_PD.
            Dim qdf As New QueryDef
            qdf.Name = "TMP_EXTENDED"
            qdf.SQL = "SELECT s.CUST, s.ITEM, s.START_PD, Min(e.START_PD-1) AS END_PD " & vbCrLf & _
            "FROM TMP AS s  " & vbCrLf & _
            "LEFT JOIN TMP AS e ON (s.CUST=e.CUST AND s.ITEM=e.ITEM AND e.START_PD>s.START_PD) " & vbCrLf & _
            "GROUP BY s.CUST, s.ITEM, s.START_PD " & vbCrLf & _
            "ORDER BY s.CUST, s.ITEM, s.START_PD"
            db.QueryDefs.Append qdf
            Debug.Print "TMP_EXTENDED Row Count is: " & DCount("*", "TMP_EXTENDED")
    
            'Test query to just perform the left join.
            Set qdf = New QueryDef
            qdf.Name = "RAW_LEFT_JOIN"
            qdf.SQL = "SELECT s.*,e.* " & vbCrLf & _
            "FROM TMP AS s  " & vbCrLf & _
            "LEFT JOIN TMP AS e ON s.ITEM = e.ITEM AND s.CUST = e.CUST AND e.START_PD>s.START_PD"
            db.QueryDefs.Append qdf
            Debug.Print "RAW_LEFT_JOIN Row Count is: " & DCount("*", "RAW_LEFT_JOIN")
    
            RefreshDatabaseWindow
    
        End Sub
    

    When run as written, the code above returns:

    TMP Table Row Count is: 7500
    TMP_EXTENDED Row Count is: 5000
    RAW_LEFT_JOIN Row Count is: 7500
    

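    This is clearly wrong, because a left join should always return every record from the left table. In this case TMP_EXTENDED should return 7500 rows, and RAW_LEFT_JOIN should return more than 7500.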
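
As a cross-check of those expected counts, here is a minimal sketch that builds the same 25 x 100 x 3 data set in SQLite (through Python's standard-library sqlite3 module) and counts the rows from each query. SQLite is not Access; it serves only as a reference for standard join semantics, and the `count` helper is a hypothetical convenience name.

```python
import sqlite3

# In-memory SQLite table standing in for the Access table TMP (assumption: reference engine only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TMP (CUST TEXT, ITEM TEXT, START_PD INTEGER, "
    "PRIMARY KEY (CUST, ITEM, START_PD))"
)
# Same data as the VBA test: 25 customers x 100 items x 3 periods.
conn.executemany(
    "INSERT INTO TMP VALUES (?, ?, ?)",
    [
        (f"CUST{c}", f"ITEM{i}", pd)
        for c in range(1, 26)
        for i in range(1, 101)
        for pd in range(1, 4)
    ],
)

JOIN_ON = "ON s.CUST = e.CUST AND s.ITEM = e.ITEM AND e.START_PD > s.START_PD"


def count(sql):
    """Hypothetical helper: number of rows the given SELECT returns."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


def grouped(join_kind):
    """The TMP_EXTENDED query, with the join type as a parameter."""
    return (
        "SELECT s.CUST, s.ITEM, s.START_PD, MIN(e.START_PD - 1) AS END_PD "
        f"FROM TMP AS s {join_kind} JOIN TMP AS e {JOIN_ON} "
        "GROUP BY s.CUST, s.ITEM, s.START_PD"
    )


def raw(join_kind):
    """The RAW_LEFT_JOIN query, with the join type as a parameter."""
    return f"SELECT s.START_PD, e.START_PD FROM TMP AS s {join_kind} JOIN TMP AS e {JOIN_ON}"


base_count = count("SELECT * FROM TMP")
left_grouped = count(grouped("LEFT"))
left_raw = count(raw("LEFT"))
inner_grouped = count(grouped("INNER"))
inner_raw = count(raw("INNER"))

print("TMP:", base_count)
print("LEFT JOIN, grouped:", left_grouped, "raw:", left_raw)
print("INNER JOIN, grouped:", inner_grouped, "raw:", inner_raw)
```

With this data the left join gives 7500 grouped rows and 10000 raw rows, while the inner join gives 5000 and 7500. The counts Access reported (5000 and 7500) coincide with the inner-join figures, which is consistent with the unmatched left rows being dropped, though this sketch cannot show why Access does so.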

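    You can change the loop bounds for custNo, itemNo, and pdNo to adjust the record count in table TMP. If you do, you will see that the queries work until the record count reaches about 7,000, and then fail.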

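    The same problem does not seem to occur when the join is performed on only one additional column. For example, I modified the code above to use a table containing only the CUST and START_PD columns, and both queries worked correctly with a table record count of 600,000.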

    Workaround

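    Since posting this, I found another very similar post that offers a decent workaround. I have adapted it as shown below. I don't know how reliable it will turn out to be, but I am using it for now.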

        SELECT s.CUST, s.ITEM, s.START_PD, Min(e.START_PD)-1 AS END_PD
        FROM TMP AS s INNER JOIN TMP AS e ON s.ITEM = e.ITEM AND s.CUST = e.CUST
        WHERE e.START_PD>s.START_PD
        GROUP BY s.CUST, s.ITEM, s.START_PD
    
        UNION ALL
    
        SELECT CUST, ITEM, Max(START_PD), Null 
        FROM TMP
        GROUP BY CUST, ITEM, Null
        ORDER BY CUST,ITEM,START_PD,END_PD
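
One way to gain some confidence in this rewrite is to check that it is logically equivalent to the original left join on an engine where the left join behaves as expected. The sketch below does that in SQLite, through Python's standard-library sqlite3 module. The sample rows are made up for illustration, the second SELECT groups by CUST and ITEM only, and END_PD subtracts 1 from the next START_PD to match the question's definition; all three are assumptions of this sketch, and it says nothing about how Access itself executes either query.

```python
import sqlite3

# In-memory SQLite table standing in for TMP (assumption: reference engine only, not Access).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TMP (CUST TEXT, ITEM TEXT, START_PD INTEGER, "
    "PRIMARY KEY (CUST, ITEM, START_PD))"
)
# Made-up sample rows: groups with three, one, and two periods.
conn.executemany(
    "INSERT INTO TMP VALUES (?, ?, ?)",
    [("A", "X", 1), ("A", "X", 5), ("A", "X", 9),
     ("A", "Y", 3),
     ("B", "X", 2), ("B", "X", 4)],
)

# Original formulation: left join with the inequality in the ON clause.
left_join = conn.execute(
    """
    SELECT s.CUST, s.ITEM, s.START_PD, MIN(e.START_PD - 1) AS END_PD
    FROM TMP AS s
    LEFT JOIN TMP AS e
      ON s.CUST = e.CUST AND s.ITEM = e.ITEM AND e.START_PD > s.START_PD
    GROUP BY s.CUST, s.ITEM, s.START_PD
    ORDER BY 1, 2, 3
    """
).fetchall()

# Workaround formulation: inner join for rows that have a later period,
# plus one row per CUST/ITEM for the latest period, which has no END_PD.
workaround = conn.execute(
    """
    SELECT s.CUST, s.ITEM, s.START_PD, MIN(e.START_PD) - 1 AS END_PD
    FROM TMP AS s
    INNER JOIN TMP AS e ON s.ITEM = e.ITEM AND s.CUST = e.CUST
    WHERE e.START_PD > s.START_PD
    GROUP BY s.CUST, s.ITEM, s.START_PD
    UNION ALL
    SELECT CUST, ITEM, MAX(START_PD), NULL
    FROM TMP
    GROUP BY CUST, ITEM
    ORDER BY 1, 2, 3
    """
).fetchall()

print(workaround)
print("same as left join:", workaround == left_join)
```

Both queries return one row per row of TMP, with a NULL END_PD for the latest period of each CUST / ITEM.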