Variance by the factor in another matching column

Time: 2016-08-04 11:14:47

Tags: r dataframe

I have a data frame in R consisting of LOS and a number of broader conditions:

LOS             Condition
  1                Spinal
  2               Urology
  1              Thoracic
  8                Spinal
  5               Billary
 ...                  ...

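I would like to find the variance of LOS within each broader condition. Is there a simple way to do this?

Any suggestions would be much appreciated, thanks!

A reproducible, similar dataset is below: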

data <- structure(list(LOS = c(6, 6, 13, 6, 19, 7), Condition = structure(c(37L, 15L, 24L, 15L, 15L, 15L), .Label = c("Acute Liver Failure", "Aortic Disease", "Arthritis and Limb Deformity/Fractures", "Asphyxiation", "Billary", "Bowel Infection/Perforation/Infarction", "Breast Cancer", "Cancer (Unoperated)", "Cardiac Arrest", "Cardiac Arythmia", "Cerebral Aneurysm (Non-Ruptured)", "Cerebral Infarction", "Cerebral Oedema", "Chronic Liver Disease", "COPD/Asthma/Respiratory Failure", "Drug Overdose and Poisoning", "Ear/Nose/Throat", "Electrolyte", "Encephalitis", "Endocrine", "Epilepsy", "Gastroectomy", "Gynaecological Cancer/Surgery", "Heart Failure", "Hydrocephalus", "Hyperventilation Syndromes", "Infection incl. unspecified", "Influenza", "Interstitial Pulmonary Disease", "Large Bowel Cancer", "Max Fax Surgeries", "Meningitis", "Myocardial Infarction", "Neuro-Surgical Cancer", "Obesity", "Other Inter-Cerebral Haemmorhage", "Pancreatitis", "Perforation of Oesophagus", "Peripheral Vascular Disease (Inlc. Ischaemia and Infarction", "Pleural Effusion", "Pneumonia", "Psychiatric", "Pulmonary/Veno-Thrombo Embollism", "Skin Inflammation/Infection", "Skull and Facial Fractures", "Spinal Cord Weakness", "Spinal Surgery/Fractures", "Spinal Trauma", "Sub-Arachnoid Haemmorhage", "Systemic Weakness", "Thoracic/Abdominal Aortic Aneurysm (Non-Ruptured)", "Thoracic/Abdominal Aortic Aneurysm (Ruptured incl. injury)", "Trauma to Intra-Abdominal Organs/Vessels", "Trauma to Thoracic Cage", "Traumatic Inter-Cerebral Haemmorhage/Contusions/Oedema", "Urology/Renal Surgery" ), class = "factor")), .Names = c("LOS", "Condition"), row.names = c(NA, 6L), class = "data.frame")

1 answer:

Answer 0: (score: 0)

This creates a new data.frame containing the results:

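# Start with an empty result; stringsAsFactors = FALSE keeps condition as plain text
res <- data.frame(condition = character(0), varLos = numeric(0), stringsAsFactors = FALSE)
# Looping over a factor yields its values as character strings
for (i in unique(data$Condition)) {
  # Sample variance of LOS for this condition (NA if it has only one row)
  v <- var(data[data$Condition == i, "LOS"], na.rm = TRUE)
  # Append a one-row data frame so that varLos stays numeric
  res <- rbind(res, data.frame(condition = i, varLos = v, stringsAsFactors = FALSE))
}
res
#                         condition   varLos
# 1                    Pancreatitis       NA
# 2 COPD/Asthma/Respiratory Failure 40.33333
# 3                   Heart Failure       NA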

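The NA values appear because a condition with only one observation has no sample variance. With your full dataset (which presumably contains more observations per condition), they should not occur.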
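
As a possible alternative to the loop, the same per-condition variance can be sketched with base R's tapply() or aggregate(). This is only a sketch: the six rows are rebuilt compactly from the question's sample (so the factor has just three levels instead of the full list), and the names var_by_condition, n_by_condition, keep and agg are chosen here for illustration.

```r
# The six rows from the question's sample, rebuilt with only the levels in use
data <- data.frame(
  LOS = c(6, 6, 13, 6, 19, 7),
  Condition = factor(c(
    "Pancreatitis",
    "COPD/Asthma/Respiratory Failure",
    "Heart Failure",
    "COPD/Asthma/Respiratory Failure",
    "COPD/Asthma/Respiratory Failure",
    "COPD/Asthma/Respiratory Failure"
  ))
)

# Sample variance of LOS within each condition, as a named array
var_by_condition <- tapply(data$LOS, data$Condition, var)

# Number of observations per condition
n_by_condition <- table(data$Condition)

# Conditions with at least two observations, i.e. where a variance is defined
keep <- names(n_by_condition)[n_by_condition >= 2]
var_by_condition[keep]

# The same summary as a data frame with columns Condition and LOS
agg <- aggregate(LOS ~ Condition, data = data, FUN = var)
agg
```

With the full factor from the question, tapply() also returns an NA entry for every level that has no rows, so calling droplevels(data) first may be worthwhile.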