SQL: Building pairs and counting samples

Time: 2019-03-04 20:33:41

Tags: mysql sql

I have the following table (sample):

ID |LOCATION|DAY           
1  | 1      |20190301   
1  | 2      |20190301  
1  | 3      |20190301  
1  | 1      |20190302   
1  | 4      |20190302  
1  | 4      |20190305     
1  | 5      |20190302   
2  | 4      |20190301       
2  | 1      |20190301   
2  | 3      |20190303   
2  | 2      |20190305  

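Here ID is the car number, LOCATION is the location ID, and DAY is the date in YYYYMMDD format. I want to write a SQL query that counts, for each car ID and each month (YYYYMM), the "paired locations": how many times the car was present at location i and how many times at location j. That is, the final result should look something like this: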

ID|LOCATION1|LOCATION2|MONTH |count1|count2
1 | 1       | 2       |201903| 2    | 1
1 | 1       | 3       |201903| 2    | 1
1 | 1       | 4       |201903| 2    | 2
1 | 1       | 5       |201903| 2    | 1
1 | 2       | 3       |201903| 1    | 1
1 | 2       | 4       |201903| 1    | 2

where count1 is the count for location 1 and count2 is the count for location 2, and one row is built for every pair of location1 and location2.

To build the pairs, I tried:

Select n1.location, n2.location
From
(
  Select location
  from table
) n1,
(
  Select location
  from table
) n2
Where n1.location < n2.location
Order by n1.location, n2.location

But I want a count for each location separately (count1, count2), not a count of the pairs.

Can I do this with a SQL subquery? Any input would be appreciated.

1 answer:

Answer 0: (score: 2)

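This is a strange request. You are looking for independent counts of the two locations, but lined up in a single row (which is odd, because a lot of the data gets repeated).

You can aggregate before joining: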

with l as (
      -- visits per car, location and month; day is the YYYYMMDD column from the sample table
      select l.id, l.location, date_format(l.day, '%Y%m') as yyyymm,
             count(*) as cnt
      from carlocations l
      group by l.id, l.location, date_format(l.day, '%Y%m')
     )
select l1.id, l1.location as location1, l2.location as location2, l1.yyyymm, l1.cnt as cnt1, l2.cnt as cnt2
from l l1 join
     l l2
     on l1.id = l2.id and l1.yyyymm = l2.yyyymm and
        l1.location < l2.location;

MySQL 8+ supports with. In earlier versions, you need to repeat the subquery in the from clause.

EDIT:

Without a CTE, it looks like:

select l1.id, l1.location as location1, l2.location as location2, l1.yyyymm, l1.cnt as cnt1, l2.cnt as cnt2
from (select l.id, l.location, date_format(l.day, '%Y%m') as yyyymm,
             count(*) as cnt
      from carlocations l
      group by l.id, l.location, date_format(l.day, '%Y%m')
     ) l1 join
     (select l.id, l.location, date_format(l.day, '%Y%m') as yyyymm,
             count(*) as cnt
      from carlocations l
      group by l.id, l.location, date_format(l.day, '%Y%m')
     ) l2
     on l1.id = l2.id and l1.yyyymm = l2.yyyymm and
        l1.location < l2.location;
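
As a quick sanity check, the aggregate-then-join idea can be run against the sample rows from the question. The sketch below is only an illustration under some assumptions: it uses Python's built-in sqlite3 module instead of MySQL, so substr(day, 1, 6) stands in for date_format(...); the table name carlocations comes from the answer, and the column names id, location and day are taken from the sample table. The helper name pair_counts is made up for this example.

```python
import sqlite3

# Sample rows from the question: (id, location, day as YYYYMMDD text).
ROWS = [
    (1, 1, "20190301"), (1, 2, "20190301"), (1, 3, "20190301"),
    (1, 1, "20190302"), (1, 4, "20190302"), (1, 4, "20190305"),
    (1, 5, "20190302"), (2, 4, "20190301"), (2, 1, "20190301"),
    (2, 3, "20190303"), (2, 2, "20190305"),
]

# Same shape as the answer's query. SQLite has no date_format(), so the
# month is cut out of the YYYYMMDD string with substr() (an assumption
# made for this sketch only).
QUERY = """
with l as (
      select id, location, substr(day, 1, 6) as yyyymm, count(*) as cnt
      from carlocations
      group by id, location, substr(day, 1, 6)
     )
select l1.id, l1.location as location1, l2.location as location2,
       l1.yyyymm, l1.cnt as cnt1, l2.cnt as cnt2
from l l1 join
     l l2
     on l1.id = l2.id and l1.yyyymm = l2.yyyymm and
        l1.location < l2.location
order by l1.id, l1.location, l2.location
"""


def pair_counts(rows):
    """Load rows into an in-memory table and return the paired counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table carlocations (id integer, location integer, day text)"
    )
    conn.executemany("insert into carlocations values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pair_counts(ROWS):
        print(row)
```

Each returned tuple is (id, location1, location2, yyyymm, cnt1, cnt2). Because the counts are computed before the join, cnt1 and cnt2 are per-location totals and are simply repeated on every row in which that location appears.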