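I am trying to join several datasets together to build a master table. I managed to keep the row count unchanged after 3 left joins, but the next step seems to increase it. Any ideas?
Query with 3 joins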
-------------------------------------------------------------------
--- STEP 4: ----------------
-------------------------------------------------------------------
SELECT
DISTINCT Table1.[Field1]
, Table1.[Field2]
, Table3.[Field3]
, Table1.[Field4]
, Table1.[Field5]
, Table1.[Field6]
, Table1.[Field7]
, Table1.[Field8]
, Table1.[Field9]
, Table1.[Field10]
FROM db1.dbo.raw_tbl_1 AS Table1
LEFT JOIN db2.dbo.tbl_2 Table2
ON Table1.Field7 = Table2.[Field13]
LEFT JOIN db2.dbo.tbl_3 Table3
ON CONVERT(INT,Table1.[Field2]) = Table3.Field14
LEFT JOIN db2.dbo.tbl_4 Table4
ON Table2.Field17 = Table4.Field15
WHERE Table2.Field17 IS NOT NULL
-- 2682270 rows (Desired row count)
Query with the additional joins (the one that increases the row count)
-------------------------------------------------------------------
--- STEP 5: ----
-------------------------------------------------------------------
SELECT
DISTINCT Table1.[Field1]
, Table1.[Field2]
, Table3.[Field3]
, Table1.[Field4]
, Table1.[Field5]
, Table1.[Field6]
, Table1.[Field7]
, Table5.[Field11]
, Table6.[Field12]
, Table1.[Field8]
, Table1.[Field9]
, Table1.[Field10]
FROM db1.dbo.raw_tbl_1 AS Table1
LEFT JOIN db2.dbo.tbl_2 Table2
ON Table1.Field7 = Table2.[Field13]
LEFT JOIN db2.dbo.tbl_3 Table3
ON CONVERT(INT,Table1.[Field2]) = Table3.Field14
LEFT JOIN db2.dbo.tbl_4 Table4
ON Table2.Field17 = Table4.Field15
LEFT JOIN db2.dbo.tbl_5 Table5
ON Table4.Field18 = Table5.Field16
LEFT JOIN db2.dbo.tbl_6 Table6
ON Table5.[Field11] = CONVERT(INT,Table6.[Table6])
WHERE Table2.Field17 IS NOT NULL
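Answer 0 (score: 9)
A LEFT JOIN creates additional rows whenever a row on the left has more than one matching row in the joined table. If you do not want this behavior, you need to use an aggregate function together with GROUP BY.
More specifically, if you query only the last table you joined (the one producing the new rows), you will be able to find the duplicated rows and decide how to handle them.
Since you mention that the last join is causing the problem, Table6 must be returning more rows than you expect. You would have to do something like: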
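To illustrate the point above, here is a minimal sketch using two hypothetical table variables (`@parent` and `@child` are not from the question). It shows one left-side row being repeated once per match, and the same join collapsed back with an aggregate and GROUP BY. Note that DISTINCT would not help here, because the repeated rows differ in the newly joined column. Using MAX is only an assumption for the demo; which value to keep depends on your data.

```sql
-- Hypothetical demo tables, not the tables from the question.
DECLARE @parent TABLE (id INT);
DECLARE @child  TABLE (parent_id INT, val VARCHAR(10));

INSERT INTO @parent VALUES (1), (2);
INSERT INTO @child  VALUES (1, 'a'), (1, 'b');  -- two matches for id = 1

-- Plain LEFT JOIN: id 1 appears once per matching child row,
-- id 2 appears once with a NULL val, so 2 parent rows become 3.
SELECT p.id, c.val
FROM @parent AS p
LEFT JOIN @child AS c
    ON c.parent_id = p.id;

-- Aggregate + GROUP BY: back to one row per parent id.
SELECT p.id, MAX(c.val) AS val
FROM @parent AS p
LEFT JOIN @child AS c
    ON c.parent_id = p.id
GROUP BY p.id;
```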
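SELECT Table5.Field11, COUNT(Table6.[Table6]) AS row_count
FROM db2.dbo.tbl_5 Table5
LEFT JOIN db2.dbo.tbl_6 Table6
ON Table5.[Field11] = CONVERT(INT,Table6.[Table6])
GROUP BY Table5.Field11
-- T-SQL does not accept the row_count alias in HAVING, so the aggregate is repeated
HAVING COUNT(Table6.[Table6]) > 1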
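(The HAVING clause assumes you expect a 1-to-1 correspondence between the tables. If not, leave it out.) You will then have to inspect the keys for which Table6 returns more rows than expected, and either modify your query or delete the offending data accordingly.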
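If you decide to modify the query rather than delete data, one possible approach is to reduce Table6 to one row per join key before joining it. The sketch below is self-contained and uses hypothetical table variables (`@tbl_5`, `@tbl_6`) that only mimic the column names from the question. The `join_key` alias and the choice of MAX to pick a single Field12 are assumptions for illustration, not a recommendation for your data.

```sql
-- Hypothetical stand-ins for db2.dbo.tbl_5 and db2.dbo.tbl_6.
DECLARE @tbl_5 TABLE (Field11 INT);
DECLARE @tbl_6 TABLE ([Table6] VARCHAR(10), Field12 VARCHAR(10));

INSERT INTO @tbl_5 VALUES (10), (20);
INSERT INTO @tbl_6 VALUES ('10', 'x'), ('10', 'y'), ('20', 'z');  -- key 10 is duplicated

-- Collapse tbl_6 to one row per converted key in a derived table,
-- then join to it, so each Field11 matches at most one row.
SELECT Table5.Field11, Table6.Field12
FROM @tbl_5 AS Table5
LEFT JOIN (
    SELECT CONVERT(INT, [Table6]) AS join_key,
           MAX(Field12)           AS Field12   -- assumption: any single value is acceptable
    FROM @tbl_6
    GROUP BY CONVERT(INT, [Table6])
) AS Table6
    ON Table5.Field11 = Table6.join_key;
```

The same pattern could be dropped into the STEP 5 query in place of the plain join to tbl_6. If the row count still grows, the join to tbl_5 is worth checking in the same way.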