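Here is the scenario: in a SQL Server 2008 R2 database, table A has StudyID, VisitCode and VisitSequenceNumber. Table B has StudyID, SubjectID and VisitCode.
Table A contains all possible VisitCodes for a given StudyID (there are more than 200 StudyIDs in this table, each with its own set of VisitCodes).
Table B contains all SubjectIDs for a given StudyID, along with all the VisitCodes recorded for each subject (there are more than 200 StudyIDs in this table, each with its own set of SubjectIDs).
For a given StudyID, I need to create a list of all SubjectIDs together with every VisitCode that each SubjectID does not have. I can't just list the SubjectIDs that are missing visits; I need to identify which visits each subject is missing.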
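So, if for StudyID 'C1234' table A has this:
| StudyID | VisitCode | VisitSequenceNumber |
|---------|-----------|---------------------|
| C1234 | V100A | 100 |
| C1234 | V110A | 110 |
| C1234 | V120A | 120 |
| C1234 | UNS | 999 |
and table B has:
| StudyID | SubjectID | VisitCode |
|---------|-----------|-----------|
| C1234 | 01-001 | V100A |
| C1234 | 01-001 | V120A |
| C1234 | 01-001 | UNS |
| C1234 | 01-002 | V110A |
| C1234 | 01-002 | UNS |
I need to return rows containing the following:
| StudyID | SubjectID | VisitCode |
|---------|-----------|-----------|
| C1234 | 01-001 | V110A |
| C1234 | 01-002 | V100A |
| C1234 | 01-002 | V120A |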
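For some reason I just can't get my head around this: how do I match subjects to their missing visits when those are, by definition, the rows that are not in table B? Any help would be greatly appreciated!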
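One way to see the shape of the answer: the missing rows are a set difference. Take every (subject, visit code) pair that should exist for a study and subtract the pairs that actually appear in table B. A minimal Python sketch of that idea on the sample data above; the variable names and the helper `missing_visits` are illustrative and not part of the question:

```python
# Sample rows copied from the question; all names here are illustrative only.
table_a = [  # (StudyID, VisitCode, VisitSequenceNumber)
    ("C1234", "V100A", 100),
    ("C1234", "V110A", 110),
    ("C1234", "V120A", 120),
    ("C1234", "UNS", 999),
]
table_b = [  # (StudyID, SubjectID, VisitCode)
    ("C1234", "01-001", "V100A"),
    ("C1234", "01-001", "V120A"),
    ("C1234", "01-001", "UNS"),
    ("C1234", "01-002", "V110A"),
    ("C1234", "01-002", "UNS"),
]


def missing_visits(table_a, table_b):
    """Return sorted (StudyID, SubjectID, VisitCode) rows absent from table_b."""
    # Subjects are only known through table B, so derive them from it.
    subjects = {(study, subject) for study, subject, _ in table_b}
    # Pair each subject with every visit code defined for its own study.
    expected = {
        (study, subject, visit)
        for study, subject in subjects
        for a_study, visit, _ in table_a
        if a_study == study
    }
    # Anything expected but not recorded is a missing visit.
    return sorted(expected - set(table_b))


for row in missing_visits(table_a, table_b):
    print(" | ".join(row))
```

The SQL answers below express the same difference with joins, `NOT EXISTS`, or `EXCEPT`.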
Answer 0 (score: 4)
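One approach is to use a cross join in a derived table to generate all possible combinations of StudyID, VisitCode and SubjectID, then left join that set to table B and filter for nulls to find the missing rows: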
select all_combo.studyid, all_combo.subjectid, all_combo.visitcode
from (
    select a.studyid, a.visitcode, b.subjectid
    from tableb b
    cross join tablea a
    where a.studyid = b.studyid -- pair subjects only with visits of their own study
    group by a.studyid, a.visitcode, b.subjectid
) all_combo
left join tableb b
    on all_combo.VisitCode = b.VisitCode
    and all_combo.StudyID = b.StudyID
    and all_combo.SubjectID = b.SubjectID
where b.StudyID is null
    and all_combo.StudyID = 'C1234' -- you might have to limit to the specific StudyID
order by all_combo.SubjectID;
Results for your sample data:
| studyid | subjectid | visitcode |
|---------|-----------|-----------|
| C1234 | 01-001 | V110A |
| C1234 | 01-002 | V100A |
| C1234 | 01-002 | V120A |
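With more than 200 studies in both tables, it matters that each subject is only paired with the visit codes of its own study. The following Python sketch runs the same combine-then-left-join pattern against an in-memory SQLite database. SQLite is an assumption made so the example can run anywhere; it is not SQL Server. The second study `C9999`, its visit codes and its subject `09-001` are made up to exercise the multi-study case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (StudyID TEXT, VisitCode TEXT, VisitSequenceNumber INTEGER);
    CREATE TABLE tableb (StudyID TEXT, SubjectID TEXT, VisitCode TEXT);
    INSERT INTO tablea VALUES
        ('C1234', 'V100A', 100), ('C1234', 'V110A', 110),
        ('C1234', 'V120A', 120), ('C1234', 'UNS', 999),
        ('C9999', 'SCR', 10), ('C9999', 'EOS', 20);   -- made-up second study
    INSERT INTO tableb VALUES
        ('C1234', '01-001', 'V100A'), ('C1234', '01-001', 'V120A'),
        ('C1234', '01-001', 'UNS'),   ('C1234', '01-002', 'V110A'),
        ('C1234', '01-002', 'UNS'),
        ('C9999', '09-001', 'SCR');                    -- made-up subject
""")

query = """
    SELECT all_combo.StudyID, all_combo.SubjectID, all_combo.VisitCode
    FROM (
        -- every subject paired with every visit code of its own study
        SELECT DISTINCT a.StudyID, a.VisitCode, b.SubjectID
        FROM tableb b
        JOIN tablea a ON a.StudyID = b.StudyID
    ) all_combo
    LEFT JOIN tableb b
        ON  all_combo.StudyID   = b.StudyID
        AND all_combo.SubjectID = b.SubjectID
        AND all_combo.VisitCode = b.VisitCode
    WHERE b.StudyID IS NULL          -- keep only combinations with no match
    ORDER BY all_combo.StudyID, all_combo.SubjectID, all_combo.VisitCode
"""
missing = conn.execute(query).fetchall()
for row in missing:
    print(row)
```

Because the derived table matches on StudyID, subject `09-001` is reported only as missing `EOS` from `C9999` and never against the visit codes of `C1234`.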
Answer 1 (score: 2)
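One thing that makes this difficult is that you don't have a table that identifies which subjects belong to a study, so we have to derive it from B. A CTE can do that. The JOIN to tableB is also a little odd, because it combines the distinct set of {StudyID, SubjectID} with the possible visits.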
WITH subjects AS (
    SELECT DISTINCT studyid, subjectid
    FROM tableb
)
SELECT s.studyid,
       s.subjectid,
       a.visitcode
FROM subjects s
INNER JOIN tablea a
    ON s.studyid = a.studyid
LEFT JOIN tableb b
    ON a.studyid = b.studyid
    AND a.visitcode = b.visitcode
    AND s.subjectid = b.subjectid
WHERE b.studyid IS NULL
ORDER BY s.studyid,
         s.subjectid
Answer 2 (score: 1)
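You are trying to find the visits that each subject is missing. I assume there is another table of subjects, say TableC, that is related to studies. Joining TableA and TableC gives the full set of subject x visit combinations; subtract TableB from that and you get the missing visits for each subject.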
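SELECT StudyID
     , SubjectID
     , VisitCode
FROM (SELECT a.StudyID
           , c.SubjectID
           , a.VisitCode
      FROM TableA a
      INNER JOIN TableC c ON a.StudyID = c.StudyID -- inner join avoids rows with a NULL SubjectID
      EXCEPT
      SELECT StudyID, SubjectID, VisitCode -- explicit columns so the order matches the first SELECT
      FROM TableB) AS missing -- SQL Server requires an alias on a derived table
WHERE StudyID = 'C1234'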
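A separate subject table has one practical advantage: it also catches subjects who have no rows in TableB at all. The sketch below checks this with `EXCEPT` against an in-memory SQLite database from Python. SQLite stands in for SQL Server here, and `TableC` with its two columns, as well as subject `01-003`, are hypothetical, since the question does not describe such a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (StudyID TEXT, VisitCode TEXT, VisitSequenceNumber INTEGER);
    CREATE TABLE TableB (StudyID TEXT, SubjectID TEXT, VisitCode TEXT);
    -- TableC is hypothetical: one row per subject enrolled in a study.
    CREATE TABLE TableC (StudyID TEXT, SubjectID TEXT);
    INSERT INTO TableA VALUES
        ('C1234', 'V100A', 100), ('C1234', 'V110A', 110),
        ('C1234', 'V120A', 120), ('C1234', 'UNS', 999);
    INSERT INTO TableB VALUES
        ('C1234', '01-001', 'V100A'), ('C1234', '01-001', 'V120A'),
        ('C1234', '01-001', 'UNS'),   ('C1234', '01-002', 'V110A'),
        ('C1234', '01-002', 'UNS');
    -- 01-003 is a made-up subject with no recorded visits.
    INSERT INTO TableC VALUES
        ('C1234', '01-001'), ('C1234', '01-002'), ('C1234', '01-003');
""")

query = """
    SELECT StudyID, SubjectID, VisitCode
    FROM (
        -- all subject x visit combinations that should exist
        SELECT a.StudyID AS StudyID, c.SubjectID AS SubjectID, a.VisitCode AS VisitCode
        FROM TableA a
        JOIN TableC c ON a.StudyID = c.StudyID
        EXCEPT
        -- minus the combinations that were actually recorded
        SELECT StudyID, SubjectID, VisitCode
        FROM TableB
    ) AS missing
    WHERE StudyID = 'C1234'
    ORDER BY SubjectID, VisitCode
"""
missing = conn.execute(query).fetchall()
for row in missing:
    print(row)
```

Subject `01-003` comes back with all four visit codes, which the approaches that derive subjects from table B cannot report, because a subject with no visits never appears there.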
Answer 3 (score: 1)
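You have to introduce some kind of master list for the required subjectIDs and visitCodes.
Edit
I take all possible values for studyID from tableB (here: limited to 'C1234'), then all values for subjectID, again from tableB, and all values for visitCodes from tableA. After that I join all of these possible values and check whether each combination already exists in tableB: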
SELECT sti studyId, sid subjectId, vc visitCode
FROM
    ( SELECT DISTINCT studyID sti FROM tableB ) stis
INNER JOIN
    ( SELECT DISTINCT StudyID ssi, subjectID sid FROM tableB ) s ON ssi = sti
INNER JOIN
    ( SELECT DISTINCT StudyID vsi, visitCode vc FROM tableA ) v ON vsi = sti
WHERE NOT EXISTS (SELECT 1 FROM tableB
                  WHERE StudyID = sti AND subjectID = sid AND visitCode = vc)
    AND sti = 'C1234' -- to limit the example to the current study
See a working demo here (MySQL): http://sqlfiddle.com/#!9/cc91b/11
or here (T-SQL 2014): demo on data.stackexchange