SQL

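Date: 2015-10-20 22:40:16

Tags: mysql sql self-join correlated-subquery

I have a database of radiology reports that I have mined for lung nodule events. Each patient has a medical record number (MRN), and each procedure has a unique accession number. An MRN can therefore have multiple accession numbers, one per procedure. Accession numbers are ascending, so if a patient has several accession numbers, the largest one belongs to the most recent procedure. I need to:

  • identify the earliest (initial) study
  • find the soonest next study after the initial one
  • calculate the time difference between each interval

I believe this can be solved with a correlated subquery. However, I am not yet familiar enough with SQL to work it out. I have tried joining the table to itself and finding the max accession in a subquery. Below is some sample code to build the dataset: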

-- Column types are assumed; the original statement listed column names only.
-- `Rank` is quoted because RANK is a reserved word in MySQL 8.0.
CREATE TABLE Stack_Example (
  `Rank` VARCHAR(8), Accession1 VARCHAR(32), MRN1 VARCHAR(32), Textbox2 VARCHAR(32),
  Textbox47 VARCHAR(32), Textbox43 VARCHAR(32), Textbox45 VARCHAR(32),
  ReadBy VARCHAR(64), SignedBy VARCHAR(64), Addendum1 VARCHAR(8),
  ReadDate VARCHAR(32), SignedDate VARCHAR(32), Textbox49 VARCHAR(32),
  Result TEXT, Impression TEXT,
  max_size_nodule VARCHAR(8), max_nodule_loc VARCHAR(64), max_nodule_type VARCHAR(64)
);

INSERT INTO Stack_Example
VALUES ("10", "33399", "001734", "5/21/1965", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "COMB, YAHAIR", "YES", "12/19/2014 11:48", "12/19/2014 17:50", "TEXT", "Results of Nodules!", "Impressions of Nodules", "3.0", "right middle lobe", "None Found");

INSERT INTO Stack_Example
VALUES ("9", "33104", "001734", "5/21/1965", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "PICK, YASELFUP", "YES", "12/21/2013 06:52", "01/21/2014 06:52", "TEXT", "Results of Nodules!", "Impressions of Nodules", "3.7", "right upper lobe", "None Found");

INSERT INTO Stack_Example
VALUES ("9", "33374", "001734", "5/21/1965", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "PICK, YASELFUP", "YES", "01/21/2014 08:19", "01/21/2014 06:52", "TEXT", "Results of Nodules!", "Impressions of Nodules", "2.1", "right lower lobe", "None Found");

INSERT INTO Stack_Example
VALUES ("1", "34453", "001734", "5/21/1965", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "PICK, YASELFUP", "YES", "03/14/2014 09:14", "03/14/2014 09:14", "TEXT", "Results of Nodules!", "Impressions of Nodules", "1.4", "left upper lobe", "None Found");

INSERT INTO Stack_Example
VALUES ("1", "27122", "80592", "1/14/1984", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "PICK, YASELFUP", "YES", "06/26/2013 10:20", "06/26/2013 10:20", "TEXT", "Results of Nodules!", "Impressions of Nodules", "2.5", "left upper lobe", "None Found");

INSERT INTO Stack_Example
VALUES ("1", "27248", "80592", "1/14/1984", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "PICK, YASELFUP", "YES", "08/01/2013 06:23", "08/01/2013 06:23", "TEXT", "Results of Nodules!", "Impressions of Nodules", "4.0", "left lower lobe", "None Found");

INSERT INTO Stack_Example
VALUES ("1", "28153", "35681", "03/01/1990", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "PICK, YASELFUP", "YES", "09/14/2012 05:00", "09/14/2012 05:00", "TEXT", "Results of Nodules!", "Impressions of Nodules", "4.0", "left lower lobe", "None Found");

INSERT INTO Stack_Example
VALUES ("1", "29007", "35681", "03/01/1990", "CTS", "3341", "ROUTINE", "TUCK, YOURPANTSIN", "PICK, YASELFUP", "YES", "11/16/2012 08:23", "11/16/2012 08:23", "TEXT", "Results of Nodules!", "Impressions of Nodules", "3.5", "right lower lobe", "None Found");

Obviously this is fake data. What I have been trying to do is join the table to itself using a correlated subquery, like this:

SELECT DISTINCT a.Accession1, a.MRN1, a.ReadDate, p.Accession1, p.ReadDate
FROM Stack_Example as a 
INNER JOIN Stack_Example as p on a.MRN1 = p.MRN1
WHERE a.Accession1 = 
(SELECT max(Accession1) 
FROM Stack_Example as b
WHERE a.MRN1 = b.MRN1 AND 
a.Accession1 != p.Accession1)
ORDER BY a.MRN1

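Ideally, what I want is a master table with one row per patient (MRN) and each of that MRN's accessions as columns (along with the accession date, etc.). Something like this:

| MRN    | Accession (First Follow-up) | Date First Follow-up | Accession (Second Follow-up) | Date Second Follow-up | etc. |
|:-------|----------------------------:|:--------------------:|-----------------------------:|:---------------------:|:-----|
| 001734 |                       33374 |        ......        |                              |                       |      |
| 80592  |                       27248 |        ......        |                              |                       |      |

I believe my subquery approach would need a series of left joins; is there a better way, though? Some patients have more than 7 follow-ups. Thanks for any help, and sorry for the long explanation. Hopefully the formatting is OK.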

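3 Answers:

Answer 0: (score: 1)

You are on the right track. You can do it with a self-join and a subquery. The table should be joined to itself on MRN1, with the latter record's Accession1 equal to the smallest Accession1 for that MRN1 that is greater than the first record's Accession1 (that is, the next accession). The left join lets the query report every record, even the last one (which has no successor).

This query generates all adjacent pairs of studies: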

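 Select a.ReadDate ARead, b.ReadDate BRead,
        -- assumes ReadDate is a DATETIME; in MySQL, plain subtraction does not yield a date difference
        TIMESTAMPDIFF(DAY, a.ReadDate, b.ReadDate) elapsed,
        a.*, b.*
 From Stack_Example a
    left Join Stack_Example b
        on b.MRN1 = a.MRN1
           and b.Accession1 =
               -- next accession for the same patient
               (Select min(Accession1) From Stack_Example
                where MRN1 = a.MRN1
                   and Accession1 > a.Accession1)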

This query generates the first three studies:

 Select a.ReadDate ARead, b.ReadDate BRead, c.ReadDate CRead,
        -- assumes ReadDate is a DATETIME column
        TIMESTAMPDIFF(DAY, a.ReadDate, b.ReadDate) elapsedAB,
        TIMESTAMPDIFF(DAY, b.ReadDate, c.ReadDate) elapsedBC
 From Stack_Example a
    left Join Stack_Example b
        on b.MRN1 = a.MRN1
           and b.Accession1 =
               (Select min(Accession1) From Stack_Example
                where MRN1 = a.MRN1
                   and Accession1 > a.Accession1)
    left Join Stack_Example c
        on c.MRN1 = a.MRN1
           and c.Accession1 =
               (Select min(Accession1) From Stack_Example
                where MRN1 = a.MRN1
                   and Accession1 > b.Accession1)
 -- keep only each patient's earliest study as the starting row
 Where a.ReadDate =
      (Select Min(ReadDate) from Stack_Example
       where MRN1 = a.MRN1)

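Answer 1: (score: 0)

Not sure whether you want all the intervals or just the first two. I believe Charles's query provides all of them.

This one gives just the first two.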

SELECT *
FROM      YourTable as O -- oldest
LEFT JOIN YourTable as neO -- next oldest
       ON O.MRN1 = neO.MRN1
      -- kept in ON rather than WHERE so patients with a single study are not dropped
      AND neO.Accession1 = (SELECT MIN(A.Accession1)
                            FROM YourTable A
                            WHERE A.MRN1 = O.MRN1
                              AND A.Accession1 <> O.Accession1)
WHERE
      O.Accession1 = (SELECT MIN(A.Accession1)
                      FROM YourTable A
                      WHERE A.MRN1 = O.MRN1);

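Answer 2: (score: 0)

You may want to post a minimal working example; your example contains many columns that are not needed and that complicate things.

The following schema is on SQL Fiddle, see below. I changed the year of accession 34453 to 2015, because the accession order and the dates did not agree.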

CREATE TABLE Stack_Example (
  Accession        VARCHAR(32),
  MRN              VARCHAR(32),
  ReadDate         DATETIME
);

-- %H parses a 24-hour clock; %h (12-hour) happens to work only because every sample time is before 13:00
INSERT INTO Stack_Example
VALUES ("33399", "001734", STR_TO_DATE("12/19/2014 11:48", "%m/%d/%Y %H:%i" )),
       ("33104", "001734", STR_TO_DATE("12/21/2013 06:52", "%m/%d/%Y %H:%i" )),
       ("33374", "001734", STR_TO_DATE("01/21/2014 08:19", "%m/%d/%Y %H:%i" )),
       ("34453", "001734", STR_TO_DATE("03/14/2015 09:14", "%m/%d/%Y %H:%i" )),
       ("27122", "80592",  STR_TO_DATE("06/26/2013 10:20", "%m/%d/%Y %H:%i" )),
       ("27248", "80592",  STR_TO_DATE("08/01/2013 06:23", "%m/%d/%Y %H:%i" )),
       ("28153", "35681",  STR_TO_DATE("09/14/2012 05:00", "%m/%d/%Y %H:%i" )),
       ("29007", "35681",  STR_TO_DATE("11/16/2012 08:23", "%m/%d/%Y %H:%i" ));

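GROUP_CONCAT all previous accessions
You seem to want a variable number of columns, or to create the columns dynamically. As far as I know, that does not work. As proposed in the other answers, you have to add a LEFT JOIN for each column. In MySQL, however, you can use GROUP_CONCAT to concatenate the values of a group. Your values will no longer sit in separate columns, but the result may be close to what you expect.

Next, to produce the difference between two subsequent dates: in PostgreSQL you have window functions for this. In MySQL you can use nested sets or adjacent lists: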

SELECT S.MRN,
       GROUP_CONCAT( 'Acc: ', S.Accession,
                     ' Date: ', S.ReadDate,
                     ' Days to prev.: ', IFNULL(S.Diff, 0)
                     ORDER BY S.Accession SEPARATOR ' :: ' ) AS Result
FROM (
  SELECT S0.MRN,
         S0.Accession,
         S0.ReadDate,
         TIMESTAMPDIFF(DAY, S1.ReadDate, S0.ReadDate) AS Diff
  -- table name spelled as created; MySQL table names are case-sensitive on most Linux setups
  FROM Stack_Example S0
  -- join on previous accession
  LEFT JOIN Stack_Example S1
    ON S1.MRN = S0.MRN
   AND S1.Accession = ( SELECT MAX(S2.Accession)
                        FROM Stack_Example S2
                        WHERE S2.MRN = S0.MRN
                          AND S2.Accession < S0.Accession )
) S
GROUP BY S.MRN;

This may be close to what you are looking for; the results are on SQL Fiddle.

|    MRN | Result                                                                                                                                                                                                                               |
|--------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 001734 | Acc: 33104 Date: 2013-12-21 06:52:00 Days to prev.: 0 :: Acc: 33374 Date: 2014-01-21 08:19:00 Days to prev.: 31 :: Acc: 33399 Date: 2014-12-19 11:48:00 Days to prev.: 332 :: Acc: 34453 Date: 2015-03-14 09:14:00 Days to prev.: 84 |
|  35681 | Acc: 28153 Date: 2012-09-14 05:00:00 Days to prev.: 0 :: Acc: 29007 Date: 2012-11-16 08:23:00 Days to prev.: 63                                                                                                                      |
|  80592 | Acc: 27122 Date: 2013-06-26 10:20:00 Days to prev.: 0 :: Acc: 27248 Date: 2013-08-01 06:23:00 Days to prev.: 35                                                                                                                      |
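
On MySQL 8.0 or later (an assumption; the fiddle above was not run on it), window functions are available in MySQL too, so the self-join on the previous accession could be replaced by LAG. A minimal sketch against the simplified three-column Stack_Example table above; the alias DaysToPrev is made up for illustration:

```sql
-- Assumes MySQL 8.0+ and the simplified table (Accession, MRN, ReadDate as DATETIME).
-- Accession is a VARCHAR here, so ordering relies on all values having the same length.
SELECT MRN,
       Accession,
       ReadDate,
       -- LAG yields NULL for each patient's first study, hence the IFNULL
       IFNULL(TIMESTAMPDIFF(DAY,
                            LAG(ReadDate) OVER (PARTITION BY MRN ORDER BY Accession),
                            ReadDate), 0) AS DaysToPrev
FROM Stack_Example
ORDER BY MRN, Accession;
```

This returns one row per study rather than one concatenated string per patient, which is usually easier to post-process.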

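Joining a fixed number of accessions
The following query is the same as the one Charles Bretana already posted. It joins a fixed number of accessions. The drawback of this query is that you do not get the latest accessions, only the earliest ones (seven, or four as in this example).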

SELECT S0.MRN,
       S0.Accession, S0.ReadDate,
       0,
       S1.Accession, S1.ReadDate,
       TIMESTAMPDIFF(DAY, S0.ReadDate, S1.ReadDate),
       S2.Accession, S2.ReadDate,
       TIMESTAMPDIFF(DAY, S1.ReadDate, S2.ReadDate),
       S3.Accession, S3.ReadDate,
       TIMESTAMPDIFF(DAY, S2.ReadDate, S3.ReadDate)

-- table name spelled as created; MySQL table names are case-sensitive on most Linux setups
FROM Stack_Example S0
-- each LEFT JOIN picks the next accession after the previous alias
LEFT JOIN Stack_Example S1
  ON S1.MRN = S0.MRN
 AND S1.Accession = ( SELECT MIN(SX.Accession)
                      FROM Stack_Example SX
                      WHERE SX.MRN = S0.MRN
                        AND SX.Accession > S0.Accession )
LEFT JOIN Stack_Example S2
  ON S2.MRN = S0.MRN
 AND S2.Accession = ( SELECT MIN(SX.Accession)
                      FROM Stack_Example SX
                      WHERE SX.MRN = S1.MRN
                        AND SX.Accession > S1.Accession )
LEFT JOIN Stack_Example S3
  ON S3.MRN = S0.MRN
 AND S3.Accession = ( SELECT MIN(SX.Accession)
                      FROM Stack_Example SX
                      WHERE SX.MRN = S2.MRN
                        AND SX.Accession > S2.Accession )

-- start from each patient's earliest accession
WHERE S0.Accession = ( SELECT MIN(SX.Accession)
                       FROM Stack_Example SX
                       WHERE SX.MRN = S0.MRN )
;

Results

|    MRN | Accession |                    ReadDate | 0 | Accession |                   ReadDate | TIMESTAMPDIFF | Accession |                   ReadDate | TIMESTAMPDIFF | Accession |                ReadDate | TIMESTAMPDIFF |
|--------|-----------|-----------------------------|---|-----------|----------------------------|---------------|-----------|----------------------------|---------------|-----------|-------------------------|---------------|
| 001734 |     33104 |  December, 21 2013 06:52:00 | 0 |     33374 |  January, 21 2014 08:19:00 |            31 |     33399 | December, 19 2014 11:48:00 |           332 |     34453 | March, 14 2015 09:14:00 |            84 |
|  80592 |     27122 |      June, 26 2013 10:20:00 | 0 |     27248 |   August, 01 2013 06:23:00 |            35 |    (null) |                     (null) |        (null) |    (null) |                  (null) |        (null) |
|  35681 |     28153 | September, 14 2012 05:00:00 | 0 |     29007 | November, 16 2012 08:23:00 |            63 |    (null) |                     (null) |        (null) |    (null) |                  (null) |        (null) |
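
If MySQL 8.0 or later is available (again an assumption), the chain of correlated LEFT JOINs might be avoided by numbering each patient's studies with ROW_NUMBER and pivoting with conditional aggregation. The number of columns is still fixed, but each extra follow-up costs a few more MAX(CASE ...) lines instead of another join. The names Numbered, rn, and the output aliases are illustrative only:

```sql
-- Assumes MySQL 8.0+ and the simplified table (Accession, MRN, ReadDate as DATETIME).
WITH Numbered AS (
  SELECT MRN, Accession, ReadDate,
         -- rn = 1 is the initial study, rn = 2 the first follow-up, and so on
         ROW_NUMBER() OVER (PARTITION BY MRN ORDER BY Accession) AS rn
  FROM Stack_Example
)
SELECT MRN,
       MAX(CASE WHEN rn = 1 THEN Accession END) AS InitialAccession,
       MAX(CASE WHEN rn = 1 THEN ReadDate  END) AS InitialDate,
       MAX(CASE WHEN rn = 2 THEN Accession END) AS FollowUp1Accession,
       MAX(CASE WHEN rn = 2 THEN ReadDate  END) AS FollowUp1Date,
       -- NULL when the patient has no follow-up study
       TIMESTAMPDIFF(DAY,
                     MAX(CASE WHEN rn = 1 THEN ReadDate END),
                     MAX(CASE WHEN rn = 2 THEN ReadDate END)) AS DaysToFollowUp1
FROM Numbered
GROUP BY MRN;
```

Further follow-ups would repeat the rn = 2 pattern with rn = 3, rn = 4, and so on.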