Automating division of data frame elements

Time: 2014-10-09 15:05:42

Tags: r automation

I have a data frame and I want to compute a percentage from it, where % treated = Treated / Total visits.

For example, % treated for acute maxillary sinusitis = 93470/93470 = 100%.

dput(droplevels(head(magma)))

structure(list(DIAG_CODE_1 = structure(c(1L, 1L, 2L, 2L, 2L, 
2L), .Label = c("4610 SINUSITIS MAXILLARY ACUT", "4619 SINUSITIS ACUTE UNSP"
), class = "factor"), GENDER = structure(c(1L, 1L, 1L, 1L, 1L, 
1L), .Label = "FEMALE", class = "factor"), AGE = structure(c(1L, 
1L, 1L, 1L, 1L, 1L), .Label = "0-2", class = "factor"), Mention_DRGU = c(5460L, 
5460L, 17790L, 17790L, 9400L, 9400L), treatment_status = structure(c(1L, 
2L, 1L, 2L, 1L, 2L), .Label = c("Total visits", "Treated"), class = "factor"), 
    diag_class_1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Acute sinusitis", class = "factor"), 
    year = c(2007L, 2007L, 2007L, 2007L, 2008L, 2008L)), .Names = c("DIAG_CODE_1", 
"GENDER", "AGE", "Mention_DRGU", "treatment_status", "diag_class_1", 
"year"), row.names = c(1285L, 1286L, 1407L, 1410L, 1408L, 1411L
), class = "data.frame")

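However, there are 432 rows. I could calculate all of them by hand, but that would be very time-consuming, and doing it by hand is not what computers are for :p. If you could help me find a way to automate this task in R, I would really appreciate it.

Is there a way in R to create a result data frame that tells me DIAG_CODE_1, GENDER, AGE, % treated and year? I have created (in Excel) what I want the output to look like, so you can see what I mean.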

(Image: the desired output table, created in Excel.)

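I will be doing this kind of calculation for other respiratory diseases as well, so I want to learn it now to make life easier in the long run.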

2 answers:

Answer 0: (score: 1)

Try this:

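# Spread treatment_status into one Mention_DRGU column per status.
magma2<-reshape(magma, idvar = c("DIAG_CODE_1","GENDER","AGE","diag_class_1","year"), timevar = "treatment_status", direction = "wide")

# Rename by name rather than by position: the wide columns follow the order in which
# the statuses first appear, and "Total visits" comes before "Treated" in the sample data.
names(magma2)[names(magma2) == "Mention_DRGU.Treated"] <- "Treated"
names(magma2)[names(magma2) == "Mention_DRGU.Total visits"] <- "TotVisits"

# Mention_DRGU is already an integer, so no conversion is needed; multiply by 100 for a percentage.
magma2$PercentageTreated <- 100 * magma2$Treated / magma2$TotVisits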

head(magma2)
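
Since the question mentions repeating the calculation for other respiratory diseases, the same idea can be wrapped in a small function. The following is only a sketch in base R: it uses merge() instead of reshape(), rebuilds the six sample rows from the dput() output so that it runs on its own, and percent_treated() is a made-up helper name, not something from a package. It assumes every group has exactly one "Treated" row and one "Total visits" row.

```r
# Sample rows rebuilt from the dput() output in the question.
magma <- data.frame(
  DIAG_CODE_1 = c("4610 SINUSITIS MAXILLARY ACUT", "4610 SINUSITIS MAXILLARY ACUT",
                  "4619 SINUSITIS ACUTE UNSP", "4619 SINUSITIS ACUTE UNSP",
                  "4619 SINUSITIS ACUTE UNSP", "4619 SINUSITIS ACUTE UNSP"),
  GENDER = "FEMALE",
  AGE = "0-2",
  Mention_DRGU = c(5460L, 5460L, 17790L, 17790L, 9400L, 9400L),
  treatment_status = c("Total visits", "Treated", "Total visits",
                       "Treated", "Total visits", "Treated"),
  diag_class_1 = "Acute sinusitis",
  year = c(2007L, 2007L, 2007L, 2007L, 2008L, 2008L),
  stringsAsFactors = TRUE
)

# Hypothetical helper: percentage treated per combination of the id columns.
# Assumes one "Treated" and one "Total visits" row per combination.
percent_treated <- function(df, ids = c("DIAG_CODE_1", "GENDER", "AGE", "year")) {
  treated <- df[df$treatment_status == "Treated", c(ids, "Mention_DRGU")]
  total   <- df[df$treatment_status == "Total visits", c(ids, "Mention_DRGU")]
  # Join the two subsets on the id columns; the count columns get the suffixes.
  out <- merge(treated, total, by = ids, suffixes = c(".treated", ".total"))
  out$PercentageTreated <- 100 * out$Mention_DRGU.treated / out$Mention_DRGU.total
  out[, c(ids, "PercentageTreated")]
}

result <- percent_treated(magma)
print(result)
```

Because the two counts are matched on the id columns and not on row or column position, the result does not depend on how the rows happen to be ordered. For another disease table with the same layout, calling percent_treated() on that data frame should be enough.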

Answer 1: (score: 1)

You can use dplyr together with tidyr (spread() comes from tidyr):

library(dplyr)
library(tidyr)

magma %>%
  spread(treatment_status, Mention_DRGU) %>%
  mutate(PercentageTreated = 100 * (Treated / `Total visits`)) %>%
  select(-diag_class_1, -`Total visits`, -Treated)
 #                    DIAG_CODE_1 GENDER AGE year PercentageTreated
 #1 4610 SINUSITIS MAXILLARY ACUT FEMALE 0-2 2007               100
 #2     4619 SINUSITIS ACUTE UNSP FEMALE 0-2 2007               100
 #3     4619 SINUSITIS ACUTE UNSP FEMALE 0-2 2008               100
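
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The sketch below shows what the same pipeline might look like with the newer function, assuming dplyr and tidyr >= 1.0.0 are installed. The sample rows are rebuilt from the dput() output in the question so that the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Sample rows rebuilt from the dput() output in the question.
magma <- data.frame(
  DIAG_CODE_1 = c("4610 SINUSITIS MAXILLARY ACUT", "4610 SINUSITIS MAXILLARY ACUT",
                  "4619 SINUSITIS ACUTE UNSP", "4619 SINUSITIS ACUTE UNSP",
                  "4619 SINUSITIS ACUTE UNSP", "4619 SINUSITIS ACUTE UNSP"),
  GENDER = "FEMALE",
  AGE = "0-2",
  Mention_DRGU = c(5460L, 5460L, 17790L, 17790L, 9400L, 9400L),
  treatment_status = c("Total visits", "Treated", "Total visits",
                       "Treated", "Total visits", "Treated"),
  diag_class_1 = "Acute sinusitis",
  year = c(2007L, 2007L, 2007L, 2007L, 2008L, 2008L),
  stringsAsFactors = TRUE
)

result <- magma %>%
  # One column per treatment_status level, filled with the Mention_DRGU counts.
  pivot_wider(names_from = treatment_status, values_from = Mention_DRGU) %>%
  mutate(PercentageTreated = 100 * Treated / `Total visits`) %>%
  select(DIAG_CODE_1, GENDER, AGE, year, PercentageTreated)

print(result)
```

The result is a tibble, not a plain data frame; as.data.frame(result) converts it if needed.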