Calculating overlap in MySQL

Date: 2016-05-10 20:40:18

Tags: mysql sql circos

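I'm trying to find out which classes overlap the most. The data is stored in MySQL, and each student has a completely separate row in the database for every class he or she takes (I didn't set it up and I can't change it). I've pasted a simplified version of the table below. In reality there are about 20 different classes.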

CREATE TABLE classes
(`student_id` int, `class` varchar(13));
INSERT INTO classes
(`student_id`, `class`)
VALUES
(55421, 'algebra'),
(27494, 'algebra'),
(64934, 'algebra'),
(65364, 'algebra'),
(21102, 'algebra'),
(90734, 'algebra'),
(20103, 'algebra'),
(57450, 'gym'),
(76411, 'gym'),
(24918, 'gym'),
(65364, 'gym'),
(55421, 'gym'),
(89607, 'world_history'),
(54522, 'world_history'),
(49581, 'world_history'),
(84155, 'world_history'),
(55421, 'world_history'),
(57450, 'world_history');

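Ultimately I'd like to use Circos (background here), but I'd be happy with any approach that lets me understand, and show people, where the overlap is greatest and smallest. This is off the top of my head, but I think I could use an output table with one row and one column per class, listing the number of overlapping students wherever two different classes intersect. Where a class intersects with itself, the cell could show the number of students who have no overlap with any other class.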

Screenshot of a 3x3 matrix from Excel

2 answers:

Answer 0 (score: 1)

Just use a self-join and aggregation:

select c1.class, c2.class, count(*)
from classes c1 join
     classes c2
     on c1.student_id = c2.student_id
group by c1.class, c2.class;

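This doesn't produce exactly the format you asked for: it returns one row per pair of classes rather than a matrix, and a row that pairs a class with itself counts every student in that class, not only those who take no other class.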
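As a rough sketch of how the self-join could be reshaped, the first query below ranks each unordered pair of classes by the number of shared students, and the second pivots the counts into a matrix using conditional aggregation. The column aliases and the three hard-coded class names are assumptions taken from the sample data; with about 20 classes, the list of SUM columns would have to be extended or generated. Both queries also assume that each (student_id, class) pair appears only once.

```sql
-- Rank each unordered pair of distinct classes by shared students.
-- c1.class < c2.class keeps one row per pair and drops self-pairs.
-- Pairs with no shared students do not appear at all.
SELECT c1.class AS class_a,
       c2.class AS class_b,
       COUNT(*) AS overlap
FROM classes c1
JOIN classes c2
  ON c1.student_id = c2.student_id
 AND c1.class < c2.class
GROUP BY c1.class, c2.class
ORDER BY overlap DESC, class_a, class_b;

-- Pivot the same self-join into one row and one column per class.
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts matches.
SELECT c1.class,
       SUM(c2.class = 'algebra')       AS algebra,
       SUM(c2.class = 'gym')           AS gym,
       SUM(c2.class = 'world_history') AS world_history
FROM classes c1
JOIN classes c2
  ON c1.student_id = c2.student_id
GROUP BY c1.class
ORDER BY c1.class;
```

On the sample data, the first query should list algebra/gym and gym/world_history with 2 shared students each, followed by algebra/world_history with 1. In the pivoted matrix the diagonal holds each class's total enrollment, not the number of students who take only that class.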

Answer 1 (score: 1)

You can do this by generating a result that represents links: src -> dst = nb

1) Get the matrix

select c1.class src_class, c2.class dst_class
from (select distinct class from classes) c1
join (select distinct class from classes) c2
order by src_class, dst_class

The "select distinct class" isn't needed just to generate the matrix; you could select the class directly and GROUP BY. In step 2, however, we need this distinct result.

Result:

src_class      dst_class
-----------------------------
algebra        algebra
algebra        gym
algebra        world_history
gym            algebra
gym            gym
gym            world_history
world_history  algebra
world_history  gym
world_history  world_history

2) Join the list of students matching both source and destination

select c1.class src_class, c2.class dst_class, count(v.student_id) overlap
from (select distinct class from classes) c1
join (select distinct class from classes) c2
left join classes v on
(
    v.class = c1.class
    and v.student_id in (select student_id from classes
                         where class = c2.class)
)
group by src_class, dst_class
order by src_class, dst_class

The distinct values (step 1) let us get every pair of classes, even pairs with no link (which show 0 instead).

Result:

src_class      dst_class      overlap
-------------------------------------
algebra        algebra           7
algebra        gym               2
algebra        world_history     1
gym            algebra           2
gym            gym               5
gym            world_history     2
world_history  algebra           1
world_history  gym               2
world_history  world_history     6

3) Do a different calculation when the classes are equal

select c1.class src_class, c2.class dst_class, count(v.student_id) overlap
from (select distinct class from classes) c1
join (select distinct class from classes) c2
left join classes v on
(
    v.class = c1.class and
    (
        -- When the classes are equal:
        -- students present only in that class
        (c1.class = c2.class
         and 1 = (select count(*) from classes
                  where student_id = v.student_id))
    or
        -- When the classes are different:
        -- students present in both classes
        (c1.class != c2.class
         and v.student_id in (select student_id from classes
                              where class = c2.class))
    )
)
group by src_class, dst_class
order by src_class, dst_class
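
The diagonal values that step 3 produces can be cross-checked with a simpler query that counts, per class, the students who take no other class. This is only a minimal sketch: the alias exclusive_students and the table aliases c and o are assumptions, and it relies on each (student_id, class) pair appearing once.

```sql
-- Count, per class, the students enrolled in that class and no other.
-- A class whose students all take another class is omitted, not shown as 0.
SELECT c.class,
       COUNT(*) AS exclusive_students
FROM classes c
WHERE NOT EXISTS (
    SELECT 1
    FROM classes o
    WHERE o.student_id = c.student_id
      AND o.class <> c.class
)
GROUP BY c.class
ORDER BY c.class;
```

On the sample data this should return 5 for algebra, 2 for gym and 4 for world_history, which are the values expected on the diagonal of the step 3 result; the off-diagonal cells are unchanged from step 2.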