Suppose I have a table mytable that looks like this:
sampleID rs A1 A2
--------------------------------
1001 rs123 A C
1001 rs124 T C
1001 rs125 A T
1001 rs126 A C
1002 rs122 A C
1002 rs123 T C
1002 rs124 T C
1002 rs125 A C
I want to compare two sampleIDs on the rows where they share the same rs value, to see whether their A1 and/or A2 values match.
For example, given
SELECT sampleID as Sample1, rs as rs1, A1 as A1_1, A2 as A2_1 FROM mytable where sampleID = "1001"
SELECT sampleID as Sample2, rs as rs2, A1 as A1_2, A2 as A2_2 FROM mytable where sampleID = "1002"
How can I write a single SELECT statement that takes the results of the two SELECTs above, joins them on rs1 = rs2, and compares A1_1 with A1_2 and A2_1 with A2_2?
Answer 0: (score: 1)
I would use a self-join here to handle the comparison:
SELECT
t1.sampleID,
t2.sampleID,
t1.rs,
t1.A1,
t2.A1,
(t1.A1 = t2.A1) AS A1_comp,
t1.A2,
t2.A2,
(t1.A2 = t2.A2) AS A2_comp
FROM mytable t1
INNER JOIN mytable t2
ON t1.sampleID < t2.sampleID
WHERE
t1.rs = t2.rs
ORDER BY
t1.sampleID,
t2.sampleID,
t1.rs;
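The join condition requires the sampleID on the left side of the join to be strictly less than the one on the right. This ensures that no pair of samples is compared twice and that no sample is compared with itself. We also take advantage of the fact that MySQL syntax allows a boolean equality expression to be selected directly, here for the A1 and A2 values. The aliases A1_comp and A2_comp will be 0 (no match) or 1 (match).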
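To check the self-join against the sample data from the question, here is a minimal sketch that runs it from Python. It assumes an in-memory SQLite database instead of MySQL (SQLite also evaluates an equality expression to 0 or 1), and the output aliases sample1 and sample2 are added only for illustration.

```python
import sqlite3

# Sample rows copied from the question's table.
rows = [
    (1001, "rs123", "A", "C"),
    (1001, "rs124", "T", "C"),
    (1001, "rs125", "A", "T"),
    (1001, "rs126", "A", "C"),
    (1002, "rs122", "A", "C"),
    (1002, "rs123", "T", "C"),
    (1002, "rs124", "T", "C"),
    (1002, "rs125", "A", "C"),
]

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (sampleID INTEGER, rs TEXT, A1 TEXT, A2 TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

query = """
SELECT t1.sampleID AS sample1,
       t2.sampleID AS sample2,
       t1.rs,
       (t1.A1 = t2.A1) AS A1_comp,
       (t1.A2 = t2.A2) AS A2_comp
FROM mytable t1
INNER JOIN mytable t2
    ON t1.sampleID < t2.sampleID
WHERE t1.rs = t2.rs
ORDER BY t1.sampleID, t2.sampleID, t1.rs
"""

comparison = conn.execute(query).fetchall()
for row in comparison:
    print(row)
```

Only rs123, rs124 and rs125 appear in the result, because rs122 and rs126 each exist for one sample only and so have no join partner.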
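Answer 1: (score: 0)
For completeness, here is a slightly modified version of the previously posted answer. In this version, the WHERE clause restricts the output to rows where the A1 or A2 values do not match: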
SELECT
A.rs1 as rs, A.Sample1_A1, A.Sample1_A2, B.Sample2_A1, B.Sample2_A2
from
(
SELECT sampleID as Sample1, rs as rs1, A1 as Sample1_A1, A2 as Sample1_A2
FROM mytable where sampleID = "1001"
)A left join
(
SELECT sampleID as Sample2, rs as rs2, A1 as Sample2_A1, A2 as Sample2_A2
FROM mytable where sampleID = "1002"
)B on A.rs1=B.rs2
where A.Sample1_A1 != B.Sample2_A1 or A.Sample1_A2 != B.Sample2_A2
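The mismatch-only query can be tried on the question's sample data with the following sketch. It again assumes an in-memory SQLite database rather than MySQL, passes the two sample IDs as bound parameters instead of string literals, and adds an ORDER BY so the output order is deterministic. None of these choices come from the original answer.

```python
import sqlite3

# Sample rows copied from the question's table.
rows = [
    (1001, "rs123", "A", "C"),
    (1001, "rs124", "T", "C"),
    (1001, "rs125", "A", "T"),
    (1001, "rs126", "A", "C"),
    (1002, "rs122", "A", "C"),
    (1002, "rs123", "T", "C"),
    (1002, "rs124", "T", "C"),
    (1002, "rs125", "A", "C"),
]

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (sampleID INTEGER, rs TEXT, A1 TEXT, A2 TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

query = """
SELECT A.rs1 AS rs, A.Sample1_A1, A.Sample1_A2, B.Sample2_A1, B.Sample2_A2
FROM (
    SELECT rs AS rs1, A1 AS Sample1_A1, A2 AS Sample1_A2
    FROM mytable WHERE sampleID = ?
) A
LEFT JOIN (
    SELECT rs AS rs2, A1 AS Sample2_A1, A2 AS Sample2_A2
    FROM mytable WHERE sampleID = ?
) B ON A.rs1 = B.rs2
WHERE A.Sample1_A1 != B.Sample2_A1 OR A.Sample1_A2 != B.Sample2_A2
ORDER BY rs
"""

# Keep only the rs values where the two samples disagree on A1 or A2.
mismatches = conn.execute(query, (1001, 1002)).fetchall()
for row in mismatches:
    print(row)
```

Note that rs126, which exists only for sample 1001, does not appear: its B columns are NULL after the LEFT JOIN, so the `!=` comparisons evaluate to NULL and the WHERE clause filters the row out. In practice the query therefore behaves like an inner join.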