Mixing inner and outer joins in MySQL

Time: 2018-07-17 13:03:53

Tags: mysql sql

I'm having trouble working out how to structure my SQL in this case. I have 3 tables:

Person table:

ID
--
A

FACT_1 table:

Person_ID DAY metric
--------------------
A         1   x
A         2   y

FACT_2 table:

Person_ID DAY metric
--------------------
A         3   a
A         2   b

I'd like the result to be:

Person_ID DAY metric1 metric2
-----------------------------
A         1   x       NULL
A         2   y       b
A         3   NULL    a

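So it's like an outer join of the person ID and day against each fact table, but when the person and day are the same, I need the two metrics combined on one row. The fact tables can be very large, so please keep that in mind.

Sorry, I'm not familiar with the formatting.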
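
For anyone who wants to reproduce the problem, the sample rows can be loaded with a setup script along these lines. This is only a sketch: the column types (`VARCHAR(10)`, `INT`) are assumptions, since the question does not state them, and the rows are the ones implied by the result tables in the answers.

```sql
-- Assumed schema: the question gives no column types, so these are guesses.
CREATE TABLE Person (
    ID VARCHAR(10) NOT NULL PRIMARY KEY
);

CREATE TABLE FACT_1 (
    Person_ID VARCHAR(10) NOT NULL,
    DAY       INT         NOT NULL,
    metric    VARCHAR(10)
);

CREATE TABLE FACT_2 (
    Person_ID VARCHAR(10) NOT NULL,
    DAY       INT         NOT NULL,
    metric    VARCHAR(10)
);

-- Sample rows from the question.
INSERT INTO Person (ID) VALUES ('A');
INSERT INTO FACT_1 (Person_ID, DAY, metric) VALUES ('A', 1, 'x'), ('A', 2, 'y');
INSERT INTO FACT_2 (Person_ID, DAY, metric) VALUES ('A', 3, 'a'), ('A', 2, 'b');
```

Note that the answers spell the table names with different casing (`fact_1`, `FACT_1`, `Fact_1`). On servers where table names are case-sensitive, the names in the queries have to match the names used here.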

3 answers:

Answer 0 (score: 1)

Live demo here

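You can get the result with a FULL JOIN on the fact tables. MySQL doesn't support FULL JOIN, but you can emulate it with two LEFT JOIN queries combined with UNION. In both queries, the WHERE clause checks that the person exists in the person table (it is done twice so that the number of rows to process is limited as early as possible):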

SELECT
  COALESCE(f.p1, f.p2) as person_id,
  COALESCE(f.d1, f.d2) as day,
  m1 as metric1,
  m2 as metric2
FROM (
SELECT f1.person_id as p1,f1.day as d1,f1.metric as m1,f2.person_id as p2,f2.day as d2,f2.metric as m2 
FROM fact_1 f1
LEFT JOIN fact_2 f2 ON f1.person_id = f2.person_id and f1.day = f2.day
WHERE EXISTS (SELECT 1 FROM person p WHERE p.id = f1.person_id)
UNION
SELECT f1.person_id as p1,f1.day as d1,f1.metric as m1,f2.person_id as p2,f2.day as d2,f2.metric as m2 
FROM fact_2 f2
LEFT JOIN fact_1 f1 ON f1.person_id = f2.person_id and f1.day = f2.day
WHERE EXISTS (SELECT 1 FROM person p WHERE p.id = f2.person_id)
) f
ORDER BY person_id, day

This gives the result:

person_id     day   metric1    metric2
---------------------------------------
A              1       x        null
A              2       y         b
A              3      null       a

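If you are sure that person_id is always valid in the fact tables (you enforce it with a foreign key constraint, or you have checked it some other way), you can skip the WHERE EXISTS checks to improve performance.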
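
One way to get that guarantee is a foreign key on each fact table. The statements below are a minimal sketch, assuming InnoDB tables, that person.id is the primary key (or otherwise indexed), and that every existing fact row already refers to a valid person; the constraint names are hypothetical.

```sql
-- Hypothetical constraint names; requires InnoDB and an index on person(id).
ALTER TABLE fact_1
    ADD CONSTRAINT fk_fact_1_person
    FOREIGN KEY (person_id) REFERENCES person (id);

ALTER TABLE fact_2
    ADD CONSTRAINT fk_fact_2_person
    FOREIGN KEY (person_id) REFERENCES person (id);
```

With these in place, an insert into either fact table with an unknown person_id is rejected, so the EXISTS subqueries no longer filter anything out.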

Consider creating indexes on fact_1(person_id, day) and fact_2(person_id, day).
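
As a sketch, the two composite indexes could be created like this; the index names are hypothetical.

```sql
-- Composite indexes matching the join condition (person_id, day).
CREATE INDEX idx_fact_1_person_day ON fact_1 (person_id, day);
CREATE INDEX idx_fact_2_person_day ON fact_2 (person_id, day);
```

The column order follows the join condition used in both LEFT JOINs, so each lookup into the other fact table can be resolved from the index.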

Answer 1 (score: 1)

Another option is to build a record set of the unique days:

 select DAY from FACT_1 
 union select DAY from FACT_2

You could also generate the days as a sequence of numbers (even with a recursive CTE, if you are on a recent version of MySQL):

select * from (
    select 1 as Day
    union all select 2
    union all select 3
    -- ...
) Days
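
The recursive CTE variant mentioned above might look like the following on MySQL 8.0 or later. It is only a sketch: the upper bound of 31 is an arbitrary assumption and should be replaced with whatever range of days the fact tables actually cover.

```sql
-- Generates the numbers 1..31 as a one-column result set named Day.
WITH RECURSIVE Days (Day) AS (
    SELECT 1
    UNION ALL
    SELECT Day + 1 FROM Days WHERE Day < 31  -- 31 is an assumed upper bound
)
SELECT Day FROM Days;
```

For long sequences, keep in mind that MySQL caps recursion through the cte_max_recursion_depth system variable, which defaults to 1000.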

You can CROSS JOIN that with the Person table, then LEFT JOIN each FACT table to get what you need:

select
    Person.`ID`
    ,Days.Day
    ,FACT_1.metric metric1
    ,FACT_2.metric metric2
from Person
    cross join 
    (    select DAY from FACT_1 
         union select DAY from FACT_2
    ) Days 
    left join FACT_1 on
        FACT_1.Person_ID = Person.`ID`
        and FACT_1.Day = Days.Day
    left join FACT_2 on
        FACT_2.Person_ID = Person.`ID`
        and FACT_2.Day = Days.Day

SQL Fiddle here

Answer 2 (score: 0)

This query doesn't need the Person table unless you want other data from it. If you do, you can join it to the UNION subquery.

SELECT
   u.Person_ID
  ,u.DAY
  ,MAX(u.metric1) AS metric1
  ,MAX(u.metric2) AS metric2
FROM 
(
  SELECT
     f1.Person_ID
    ,f1.DAY
    ,f1.metric AS metric1
    ,NULL AS metric2
  FROM 
    Fact_1 AS f1
  UNION ALL
  SELECT
     f2.Person_ID
    ,f2.DAY
    ,NULL AS metric1
    ,f2.metric AS metric2
  FROM 
    Fact_2 AS f2
) AS u
GROUP BY
   u.Person_ID
  ,u.DAY

Results:
+-----------+-----+---------+---------+
| Person_ID | DAY | metric1 | metric2 |
+-----------+-----+---------+---------+
| A         |   1 | x       | NULL    |
| A         |   2 | y       | b       |
| A         |   3 | NULL    | a       |
+-----------+-----+---------+---------+
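
If the Person table is needed after all, one way to bring it in is an INNER JOIN against the UNION ALL subquery. This is a sketch that reuses the table and column names from the question; it also drops any fact rows whose Person_ID has no match in Person.

```sql
SELECT
   p.ID AS Person_ID
  ,u.DAY
  ,MAX(u.metric1) AS metric1
  ,MAX(u.metric2) AS metric2
FROM
  Person AS p
  -- Inner join: only people present in Person are kept.
  INNER JOIN
  (
    SELECT f1.Person_ID, f1.DAY, f1.metric AS metric1, NULL AS metric2
    FROM Fact_1 AS f1
    UNION ALL
    SELECT f2.Person_ID, f2.DAY, NULL AS metric1, f2.metric AS metric2
    FROM Fact_2 AS f2
  ) AS u ON u.Person_ID = p.ID
GROUP BY
   p.ID
  ,u.DAY
ORDER BY
   p.ID
  ,u.DAY
```

Any additional Person columns added to the SELECT list would also need to appear in the GROUP BY (or be functionally dependent on p.ID) when ONLY_FULL_GROUP_BY is enabled.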