在SQl服务器中构建比较矩阵

时间:2013-07-30 20:55:46

标签: sql sql-server database

关于复杂的SQL查询,我有一个问题。 (我使用的是sql server)在我的数据库中,我有一组Classes,以及这些类具有的一组技能(以多对多的关系)。

我想构建一个构建矩阵比较的查询(类似于How to create a matrix with SQL中的那个)但是如果可能的话,不要对每列进行硬编码。 我希望通过以下方式比较每个类对的重叠量:

overlap = (Skills_Both_Classes_Have / Total_Skills) * 100

我的主要问题是如何以我可以修改/显示的方式找到每对类的重叠。

类表的格式化:

|ClassID |   ClassName |
------------------------
|1       |    Class1   |
|2       |    Class2   |
|3       |    Class3   |

技能表的格式化:

|SkillID |   SkillName |
------------------------
|1       |    Skill1   |
|2       |    Skill2   |
|3       |    Skill3   |

中位数表的格式:

|ClassID |SkillID|
------------------
|1       |   1   |
|1       |   2   |
|1       |   3   |
|2       |   2   |
|2       |   4   |
|2       |   5   |
|3       |   1   |
|3       |   2   |
|3       |   5   |

示例输出:

       |Class1 |Class2 |Class3 | 
--------------------------------
Class1 |  100  |  033  |  066  |
Class2 |  033  |  100  |  066  |
Class3 |  066  |  066  |  100  |

我一直在使用数据透视等功能,但我无法绕过在SQL中实现此功能的最佳方式。

在任何其他语言中,我会为每个循环使用一对,并将输出发送到数组中,但这似乎不是SQL中非常好的解决方案。这不是为了作业或任何事情,仅仅是出于个人的好奇心。

1 个答案:

答案 0 :(得分:1)

我设法编写了一个可以获得我想要的内容的查询,但并不像我想的那样容易阅读。

--Builds a list of all combinations of classes
SELECT [ClassID], [ClassName] INTO #classnames FROM [Glad1].[dbo].[Classes] 
SELECT a.classname AS Class1, a.ClassID AS Class1ID, b.classname AS Class2, b.ClassID AS    Class2ID INTO #combos 
FROM #classnames a cross join #classnames b
SELECT Classes.ClassName,COUNT(*) AS SkillCount INTO #skillcounts FROM [Glad1].[dbo].[ClassSkills]
RIGHT JOIN [Glad1].[dbo].[Classes]
ON ClassSkills.ClassID=Classes.ClassID
GROUP BY ClassName


--Finds the percent overlap for each class combination
SELECT ClassOne, ClassTwo,CAST(ROUND(((SharedSkills * 2.0) / (sc1.skillCount + sc2.SkillCount) * 100.0),0) AS DECIMAL(8,0)) 
AS PercentOverlap INTO #percentoverlaps FROM
(SELECT cn1.ClassName AS ClassOne, cn2.ClassName AS ClassTwo, SharedSkills FROM #classnames cn1 JOIN
(SELECT Class1ID, Class2ID, Count(Class1Skills.[SkillID]) AS SharedSkills FROM
(SELECT DISTINCT [Class1ID],[SkillID] FROM [Glad1].[dbo].[ClassSkills] 
JOIN #combos ON #combos.Class1ID = ClassSkills.ClassID) Class1Skills
INNER JOIN
(SELECT DISTINCT [Class2ID], [SkillID] FROM [Glad1].[dbo].[ClassSkills] 
JOIN #combos ON #combos.Class2ID = ClassSkills.ClassID) CLass2Skills
ON Class1Skills.[SkillID] = Class2Skills.[SkillID]
Group by Class1ID, CLass2ID
) AllSharedSkills
ON cn1.ClassID = Class1ID
JOIN #classnames cn2
ON cn2.ClassID = Class2ID) Named
JOIN #skillcounts sc1
ON sc1.ClassName = ClassOne
JOIN #skillcounts sc2
ON sc2.ClassName = ClassTwo

--Dynamically builds the columns to turn the results into a matrix of percent overlap
DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX);

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(c.ClassTwo) 
            FROM #percentoverlaps c
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT ClassOne, ' + @cols + ' from 
            (
                select ClassOne, ClassTwo, PercentOverlap
                from #percentoverlaps
           ) x
            pivot 
            (
                 max(PercentOverlap)
                for ClassTwo in (' + @cols + ')
            ) p '


execute(@query)



--cleans up temp tables
DROP TABLE #classnames
DROP TABLE #combos
DROP TABLE #skillcounts
DROP TABLE #percentoverlaps