我有一个查询,我试图只检索有一个值的1个结果(电话号码字段,可以存储多个电话号码),但是我的结果是复制与供应商关联的每个电话号码的结果。以下是结果集的示例:
从上图中可以看出,NAME1的结果正在重复,因为PHONE字段有3个不同的值。我只想拉1个电话号码(无论哪一个,只要没有空值)。下面是我的sql代码示例,您可以看到注释掉的部分,我试图在C.PHONE的派生查询中捕获MAX值。
SELECT DISTINCT A.NAME1, SUM( A.REMIT_AMT), A.REMIT_VENDOR, (C.PHONE),
C.PHONE_TYPE
FROM PS_PAYMENT_TBL A, PS_VENDOR B, PS_VENDOR_ADDR_PHN C
WHERE A.PYMNT_DT >= '01-01-2017'
AND A.PYMNT_DT <= '12-31-2017'
AND A.REMIT_SETID = 'SHARE'
AND B.SETID = A.REMIT_SETID
AND B.VENDOR_ID = A.REMIT_VENDOR
AND B.VENDOR_CLASS <> 'E'
AND B.SETID = C.SETID
AND B.VENDOR_ID = C.VENDOR_ID
--AND C.PHONE =
--(SELECT MAX(C2.PHONE) FROM PS_VENDOR_ADDR_PHN C2)
AND C.EFFDT =
(SELECT MAX(C_ED.EFFDT) FROM PS_VENDOR_ADDR_PHN C_ED
WHERE C.SETID = C_ED.SETID
AND C.VENDOR_ID = C_ED.VENDOR_ID
AND C.ADDRESS_SEQ_NUM = C_ED.ADDRESS_SEQ_NUM
AND C_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
GROUP BY A.NAME1, A.REMIT_VENDOR, C.PHONE, C.PHONE_TYPE
ORDER BY 2 DESC
我不相信MS SQL Server像MySQL那样支持LIMIT功能,我可以使用与MS SQL Server兼容的类似内容吗?谢谢!
1/24更新:
SELECT DISTINCT A.NAME1, SUM( A.REMIT_AMT) As TOTAL_SPEND, A.REMIT_VENDOR,
C.FIRST_PHONE, C.FIRST_PHONE_TYPE
FROM
PS_PAYMENT_TBL A
LEFT JOIN (
SELECT DISTINCT VENDOR_ID,
FIRST_VALUE(PHONE) OVER (
PARTITION BY VENDOR_ID
ORDER BY PHONE DESC
ROWS UNBOUNDED PRECEDING
) AS FIRST_PHONE,
FIRST_VALUE(PHONE_TYPE) OVER (
PARTITION BY VENDOR_ID
ORDER BY PHONE DESC
ROWS UNBOUNDED PRECEDING
) AS FIRST_PHONE_TYPE
FROM PS_VENDOR_ADDR_PHN C
WHERE PHONE IS NOT NULL
) C ON A.REMIT_VENDOR = C.VENDOR_ID
, PS_VENDOR B , PS_VENDOR_ADDR_PHN CED
WHERE A.PYMNT_DT >= '01-01-2017'
AND A.PYMNT_DT <= '12-31-2017'
AND A.REMIT_SETID = 'SHARE'
AND B.SETID = A.REMIT_SETID
AND B.VENDOR_ID = A.REMIT_VENDOR
AND B.VENDOR_CLASS <> 'E'
AND B.SETID = CED.SETID
AND B.VENDOR_ID = C.VENDOR_ID
AND CED.EFFDT =
(SELECT MAX(CED.EFFDT) FROM PS_VENDOR_ADDR_PHN CED
WHERE CED.SETID = CED.SETID
AND CED.VENDOR_ID = CED.VENDOR_ID
AND CED.ADDRESS_SEQ_NUM = CED.ADDRESS_SEQ_NUM
AND CED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
GROUP BY A.NAME1, A.REMIT_VENDOR, C.FIRST_PHONE, C.FIRST_PHONE_TYPE
ORDER BY 2 DESC
答案 0 :(得分:1)
OP的问题是关于SQL Server中的LIMIT
,当实际问题是通过将一个表中的单个记录连接到具有多个记录的其他表(经典XY problem)而引入的重复记录时
让我们尝试确定何时引入重复记录。以下查询没有引入重复记录:
-- Single table, no joins
SELECT A.NAME1, SUM( A.REMIT_AMT), A.REMIT_VENDOR
FROM PS_PAYMENT_TBL A
WHERE
A.PYMNT_DT >= '01-01-2017'
AND A.PYMNT_DT <= '12-31-2017'
AND A.REMIT_SETID = 'SHARE'
GROUP BY A.NAME1, A.REMIT_VENDOR
据推测,这将返回以下内容:
NAME1 (No column name) REMIT_VENDOR
------------------------------ ---------------- ------------
Owens & Minor Inc 11828775.730 ERSUT
Besse Medical Supply 11807640.040 BESMED
Medtronic 11663552.040 MEDINB
Boston Scientific Corporation 8232800.420 SCIMED
Epic Systems Corporation 6841444.300 75128A
但是,由于每个供应商在PS_VENDOR_ADDR_PHN
表中可能有多个电话号码,一旦我们JOIN
这两个:
SELECT A.NAME1, SUM( A.REMIT_AMT), A.REMIT_VENDOR
FROM PS_PAYMENT_TBL A
INNER JOIN PS_VENDOR_ADDR_PHN C ON A.REMIT_VENDOR = C.VENDOR_ID
WHERE
A.PYMNT_DT >= '01-01-2017'
AND A.PYMNT_DT <= '12-31-2017'
AND A.REMIT_SETID = 'SHARE'
GROUP BY A.NAME1, A.REMIT_VENDOR
我们将获得重复记录。我们可能不会一开始就注意到它,只要我们在A.REMIT_VENDOR
分组;但是由于重复的记录,总数也会变得混乱。
我建议,而不是JOIN
在具有重复相关记录的表上,JOIN
在子查询上,每个A.REMIT_VENDOR
只有一条记录,因此不会引入重复记录。
SELECT A.NAME1, SUM( A.REMIT_AMT), A.REMIT_VENDOR, C.FIRST_PHONE, C.FIRST_PHONE_TYPE
FROM PS_PAYMENT_TBL A
LEFT JOIN (
-- This subquery returns the first PHONE and PHONE_TYPE, per VENDOR_ID
-- if the records were ordered by the PHONE in DESC order
-- FIRST_VALUE is a window function
SELECT DISTINCT VENDOR_ID,
FIRST_VALUE(PHONE) OVER (
PARTITION BY VENDOR_ID
ORDER BY PHONE DESC
ROWS UNBOUNDED PRECEDING
) AS FIRST_PHONE,
FIRST_VALUE(PHONE_TYPE) OVER (
PARTITION BY VENDOR_ID
ORDER BY PHONE DESC
ROWS UNBOUNDED PRECEDING
) AS FIRST_PHONE_TYPE
FROM PS_VENDOR_ADDR_PHN
WHERE PHONE IS NOT NULL
) C ON A.REMIT_VENDOR = C.VENDOR_ID
GROUP BY A.NAME1, A.REMIT_VENDOR
参考文献:
JOIN
代替WHERE
您的查询不使用JOIN
将多组数据相关联,而是将WHERE
条件应用于所有数据的笛卡尔积。我自己的感觉是,最好使用JOIN
来表达两组数据之间的关系,并使用WHERE
专门用于排除数据集中的记录;如果只是因为它可以更容易地调试这些类型的“一个表中的重复记录导致整个结果中的重复记录”问题,那么当您可以看到每个数据集如何与其他数据集相关时。 (请参阅here和here。)
我建议使用JOIN
来合并PS_VENDOR
表中的数据:
SELECT A.NAME1, SUM( A.REMIT_AMT) AS TOTAL_SPEND, A.REMIT_VENDOR,
C.FIRST_PHONE, C.FIRST_PHONE_TYPE
FROM PS_PAYMENT_TBL A
-- See the INNER JOIN here; it's now easier to understand how PS_PAYMENT_TABLE
-- and PS_VENDOR are related
INNER JOIN PS_VENDOR B
ON A.REMIT_SETID = B.SETID
AND A.REMIT_VENDOR = B.VENDOR_ID
LEFT JOIN (
SELECT DISTINCT VENDOR_ID,
FIRST_VALUE(PHONE) OVER (
PARTITION BY VENDOR_ID
ORDER BY PHONE DESC
ROWS UNBOUNDED PRECEDING
) AS FIRST_PHONE,
FIRST_VALUE(PHONE_TYPE) OVER (
PARTITION BY VENDOR_ID
ORDER BY PHONE DESC
ROWS UNBOUNDED PRECEDING
) AS FIRST_PHONE_TYPE
FROM PS_VENDOR_ADDR_PHN C
WHERE PHONE IS NOT NULL
) C ON A.REMIT_VENDOR = C.VENDOR_ID,
WHERE
A.PYMNT_DT >= '01-01-2017'
AND A.PYMNT_DT <= '12-31-2017'
AND A.REMIT_SETID = 'SHARE'
-- with the JOIN, we can apply filtering conditions on data from the B table
AND B.VENDOR_CLASS <> 'E'
GROUP BY A.NAME1, A.REMIT_VENDOR
ORDER BY 2 DESC
EEFDT
字段(仍在审理中)
SQL Server的相应语法是:
AND C.PHONE =
(SELECT TOP 1 C2.PHONE FROM PS_VENDOR_ADDR_PHN C2)
这将返回单个任意PHONE
。要返回最大PHONE
,请对子查询记录进行排序:
AND C.PHONE =
(SELECT TOP 1 C2.PHONE FROM PS_VENDOR_ADDR_PHN C2 ORDER BY C2.PHONE DESC)
参考 - TOP clause