Query for logistic regression, multiple EXISTS

Date: 2012-07-24 13:29:18

Tags: sql sql-server-2008 tsql

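The input to a logistic regression consists of a unique identifying number followed by multiple binary variables (always 1 or 0), depending on whether a person meets certain criteria. Below I have a query that lists several of these binary conditions. With only four such criteria, the query takes somewhat longer to run than I would expect. Is there a more efficient approach than the one below? Note: tblicd is a large lookup table of text descriptions with 15k+ rows. The query itself is not meaningful; it is just a proof of concept. I have the proper indexes on my composite keys.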

select  patient.patientid 
,case when exists
(
    select c.patientid from tblclaims as c
    inner join patient as p on p.patientid=c.patientid
    and c.admissiondate = p.admissiondate
    and c.dischargedate = p.dischargedate
    where patient.patientid = p.patientid
    group by c.patientid
    having count(*) > 1000
    )
    then '1' else '0'
    end as moreThan1000
,case when exists
(
    select c.patientid from tblclaims as c
    inner join patient as p on p.patientid=c.patientid
    and c.admissiondate = p.admissiondate
    and c.dischargedate = p.dischargedate
    where patient.patientid = p.patientid
    group by c.patientid
    having count(*) > 1500
    )
    then '1' else '0'
    end as moreThan1500
,case when exists
(
    select distinct picd.patientid from patienticd as picd
    inner join patient as p on p.patientid= picd.patientid
    and picd.admissiondate = p.admissiondate
    and picd.dischargedate = p.dischargedate
    inner join tblicd as t on t.icd_id = picd.icd_id
    where t.descrip like '%diabetes%' and patient.patientid = picd.patientid
    )
    then '1' else '0'
    end as diabetes
,case when exists
(
    select r.patientid, count(*) from patient as r
    where r.patientid = patient.patientid
    group by r.patientid
    having count(*) >1
    ) 
    then '1' else '0'
    end 


from patient
order by moreThan1000 desc

2 Answers:

Answer 0 (score: 2)

I would start by using subqueries in the from clause:

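select p.patientid,  -- take the id from patient: q.patientid is null for patients with no matching claims
       coalesce(q.moreThan1000, 0) as moreThan1000,  -- 0 rather than null when q has no row
       coalesce(q.moreThan1500, 0) as moreThan1500,
       (case when d.patientid is not null then 1 else 0 end) as diabetes,
       (case when pc.patientid is not null then 1 else 0 end)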
from patient p left outer join
     (select c.patientid,
             (case when count(*) > 1000 then 1 else 0 end) as moreThan1000,
             (case when count(*) > 1500 then 1 else 0 end) as moreThan1500
      from tblclaims as c inner join
           patient as p
           on p.patientid=c.patientid and
              c.admissiondate = p.admissiondate and
              c.dischargedate = p.dischargedate
      group by c.patientid
     ) q
     on p.patientid = q.patientid left outer join
     (select distinct picd.patientid
      from patienticd as picd inner join
           patient as p
           on p.patientid= picd.patientid and
              picd.admissiondate = p.admissiondate and
              picd.dischargedate = p.dischargedate inner join
          tblicd as t
          on t.icd_id = picd.icd_id
      where t.descrip like '%diabetes%'
     ) d
     on p.patientid = d.patientid left outer join
     (select r.patientid, count(*) as cnt
      from patient as r
      group by r.patientid
      having count(*) >1
     ) pc
     on p.patientid = pc.patientid
order by 2 desc

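You can then simplify these subqueries further by combining them (for example, "p" and "pc" in the outer query can be merged into one). Even so, without the correlated subqueries, SQL Server should have an easier time optimizing the query.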
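As a rough sketch of that merge, and assuming the duplicate check only needs a per-patient row count, the "pc" subquery can be folded into the scan of patient with a windowed count. The table variable `@patient`, its sample rows, and the alias `hasMultipleRows` are hypothetical stand-ins for illustration, not part of the original query.

```sql
-- Hypothetical sample data standing in for the real patient table.
declare @patient table (patientid int, admissiondate date, dischargedate date);

insert into @patient (patientid, admissiondate, dischargedate)
values (1, '2012-01-01', '2012-01-05'),
       (1, '2012-03-10', '2012-03-12'),
       (2, '2012-02-01', '2012-02-03');

-- count(*) over (partition by ...) attaches the per-patient row count to
-- every row, so no separate grouped subquery and join back is needed.
select p.patientid,
       (case when count(*) over (partition by p.patientid) > 1
             then 1 else 0 end) as hasMultipleRows
from @patient as p;
```

With this sample data, both rows for patient 1 are flagged 1 and the single row for patient 2 is flagged 0. Whether this is actually faster than the joined subquery depends on the indexes and the plan SQL Server picks, so it is worth comparing both on the real tables.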

Answer 1 (score: 1)

An example of the left join approach, as requested...

SELECT
    patientid,
    ISNULL(CondA.ConditionA,0) as IsConditionA,
    ISNULL(CondB.ConditionB,0) as IsConditionB,
    ....
FROM
    patient
        LEFT JOIN
    (SELECT DISTINCT patientid, 1 as ConditionA from ... where ... ) CondA
        ON patient.patientid = CondA.patientID
        LEFT JOIN
    (SELECT DISTINCT patientid, 1 as ConditionB from ... where ... ) CondB
        ON patient.patientid = CondB.patientID

If your condition queries return at most one row per patient, you can drop the DISTINCT and simplify them to

    (SELECT patientid, 1 as ConditionA from ... where ... ) CondA
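To make the pattern concrete, here is a minimal sketch that fills in one condition using the diabetes check from the question. The table variables and sample rows are hypothetical, the alias `IsDiabetes` is made up for illustration, and the admission/discharge date matching from the question is left out for brevity.

```sql
-- Hypothetical sample data; table and column names follow the question.
declare @patient table (patientid int);
declare @patienticd table (patientid int, icd_id int);
declare @tblicd table (icd_id int, descrip varchar(100));

insert into @patient values (1), (2), (3);
insert into @tblicd values (10, 'Type 2 diabetes mellitus'),
                           (20, 'Essential hypertension');
-- Patient 1 has two matching diagnosis rows, patient 2 a non-matching one,
-- and patient 3 has none at all.
insert into @patienticd values (1, 10), (1, 10), (2, 20);

select p.patientid,
       isnull(CondA.ConditionA, 0) as IsDiabetes
from @patient as p
     left join
     -- distinct collapses patient 1's two matches into a single row, so the
     -- left join cannot multiply rows in the outer result.
     (select distinct picd.patientid, 1 as ConditionA
      from @patienticd as picd
           inner join @tblicd as t on t.icd_id = picd.icd_id
      where t.descrip like '%diabetes%'
     ) as CondA
     on p.patientid = CondA.patientid
order by p.patientid;
```

With this sample data the result has one row per patient: patient 1 is flagged 1, while patients 2 and 3 are flagged 0. Removing the DISTINCT here would return patient 1 twice, which is why the simplification above only applies when the condition query cannot produce more than one row per patient.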