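Function to calculate and plot means

Date: 2018-06-29 13:32:28

Tags: r function loops ggplot2 data-manipulation

I can calculate the mean of a variable for each level of a categorical variable and then plot those means using the following: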

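    library(tidyverse)
    library(Rmisc)

    # Drop rows where outcome or abc.q is missing (NA) or the string "NA"
    data.abc <- data %>% filter(outcome != "NA" & abc.q != "NA")

    # N, mean, SD, SE and 95% CI of outcome within each level of abc.q
    means.d <- summarySE(data.abc, measurevar = "outcome", groupvars = "abc.q", na.rm = TRUE)

    # Plot each group mean with its confidence interval
    out.abc.plot <- ggplot(data = means.d, aes(x = abc.q, y = outcome, ymin = outcome - ci, ymax = outcome + ci)) +
      geom_pointrange() +
      ylab("Mean (95% CI)") +
      ggtitle("outcome: abc quartile")

I want to calculate and plot the mean of each variable in out across the categories of each variable in groups. Is there a function or loop that could automate the code above?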
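For illustration, here is a minimal sketch of what such automation might look like. It is not a confirmed answer, and it rests on several assumptions: out and groups are taken to be character vectors of column names, the built-in mtcars data set stands in for the real data, the helper name plot_group_means is hypothetical, and the summary is computed with dplyr rather than summarySE (using the usual t-based 95% interval).

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: mean and t-based 95% CI of one outcome column
# within each level of one grouping column, plus a point-range plot.
plot_group_means <- function(data, outcome, group) {
  means <- data %>%
    filter(!is.na(.data[[outcome]]), !is.na(.data[[group]])) %>%
    group_by(level = factor(.data[[group]])) %>%
    summarise(
      N   = n(),
      avg = mean(.data[[outcome]]),
      se  = sd(.data[[outcome]]) / sqrt(N),
      .groups = "drop"
    ) %>%
    mutate(ci = qt(0.975, df = N - 1) * se)

  plot <- ggplot(means, aes(x = level, y = avg, ymin = avg - ci, ymax = avg + ci)) +
    geom_pointrange() +
    xlab(group) +
    ylab("Mean (95% CI)") +
    ggtitle(paste0(outcome, ": ", group))

  list(means = means, plot = plot)
}

# Assumed stand-ins for the real inputs: column names in a built-in data set.
data   <- mtcars
out    <- c("mpg", "hp")
groups <- c("cyl", "gear")

# Every outcome/group pair, one summary table and one plot per pair.
combos  <- expand.grid(outcome = out, group = groups, stringsAsFactors = FALSE)
results <- Map(function(o, g) plot_group_means(data, o, g),
               combos$outcome, combos$group)
names(results) <- paste(combos$outcome, combos$group, sep = "_by_")
```

Each element of results holds a summary table in means and a ggplot object in plot, so results[["mpg_by_cyl"]]$plot would draw the chart for one pair. Keeping the summary and the plot in one function means the loop only has to enumerate the combinations.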

0 answers:

No answers