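I have data representing the severity of patients' asthma symptoms under different conditions. The severity variables are ordered factors that all share the same levels (Mild < Moderate < Severe). Here is a simplified example: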
# Create example data frame
df <- data.frame(
ID = c(1:5),
Daytime = c("Mild", "Severe", "Mild", "Moderate", "Moderate"), # severity of daytime symptoms
Sleep = c("Moderate", NA, "Mild", "Mild", "Moderate"), # severity of nighttime symptoms
Activity = c("Mild", "Moderate", "Mild", "Moderate", "Severe") # severity of symptoms during activity
)
# Specify order of factor levels
df$Daytime <- ordered(
df$Daytime,
levels = c("Mild",
"Moderate",
"Severe")
)
df$Sleep <- ordered(
df$Sleep,
levels = c("Mild",
"Moderate",
"Severe")
)
df$Activity <- ordered(
df$Activity,
levels = c("Mild",
"Moderate",
"Severe")
)
df
The resulting data frame looks like this:
ID Daytime Sleep Activity
1 1 Mild Moderate Mild
2 2 Severe <NA> Moderate
3 3 Mild Mild Mild
4 4 Moderate Mild Moderate
5 5 Moderate Moderate Severe
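I'm trying to create an "overall severity" variable, where a patient's overall severity is the most severe symptom reported across the three categories (daytime, sleep, and activity). In other words, Overall should equal the highest level among Daytime, Sleep, and Activity. The result would look like this: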
ID Daytime Sleep Activity Overall
1 1 Mild Moderate Mild Moderate
2 2 Severe <NA> Moderate Severe
3 3 Mild Mild Mild Mild
4 4 Moderate Mild Moderate Moderate
5 5 Moderate Moderate Severe Severe
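I'd like to do this without writing a big, clunky for loop, but I can't figure out how. I thought I might be able to do it with ave(), but it doesn't seem to handle multiple variables at once: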
> df$Overall <- ave(c(df$Daytime, df$Sleep, df$Activity),
+ df$ID,
+ FUN = function(i) max (i, na.rm=T)
+ )
Error in `$<-.data.frame`(`*tmp*`, "Worst", value = c(2L, 3L, 1L, 2L, :
replacement has 15 rows, data has 5
Is there an apply-family function that can do this?
Answer 0 (score: 4)
A simple way to do this is:
df$Overall <- apply(df[, 2:4], 1, max, na.rm = TRUE)
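One caveat worth noting: as far as I understand apply(), it first coerces the data frame to a character matrix, so max() here compares the labels as strings. That happens to give the right answer because "Mild", "Moderate", "Severe" also sort alphabetically in severity order, and the result is a plain character vector rather than an ordered factor. A minimal sketch of an alternative that compares the underlying integer codes instead is shown below. The names sev_levels, sev_cols, codes, and worst are my own, introduced only for illustration, and the sketch assumes all three columns share exactly the same levels.

```r
# Rebuild the example data so the sketch is self-contained
sev_levels <- c("Mild", "Moderate", "Severe")  # assumed shared level order
df <- data.frame(
  ID = 1:5,
  Daytime  = c("Mild", "Severe", "Mild", "Moderate", "Moderate"),
  Sleep    = c("Moderate", NA, "Mild", "Mild", "Moderate"),
  Activity = c("Mild", "Moderate", "Mild", "Moderate", "Severe")
)
sev_cols <- c("Daytime", "Sleep", "Activity")
df[sev_cols] <- lapply(df[sev_cols], ordered, levels = sev_levels)

# Convert each ordered factor to its integer codes (1 = Mild, ..., 3 = Severe)
codes <- lapply(df[sev_cols], as.integer)

# Row-wise (parallel) maximum of the codes, ignoring NAs
worst <- do.call(pmax, c(codes, na.rm = TRUE))

# Map the codes back to labels and restore the ordering
df$Overall <- ordered(sev_levels[worst], levels = sev_levels)

as.character(df$Overall)
# "Moderate" "Severe" "Mild" "Moderate" "Severe"
```

Because the comparison is done on the level codes, this keeps working if the labels are later renamed to something that does not sort alphabetically, and Overall stays an ordered factor that can be compared with < and >. A row where all three columns are NA would come out as NA.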