Question

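I have 30 patients with 100 clinical measurements each, such as weight, BMI, and waist circumference, and I want to get the mean and SD of every measurement, grouped by disease status. For example, my dataset looks like this: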

Patient_id   DateOfBirth       Sex     Weight1   BMI1   Waist1  Disease
204065       25-06-1995       Female    113.8    41.3   105.8   0
200214       09-12-1990       Female      90     35.6   108     1
191633       14-09-1971         Male    128.4    47     150     1
186156       22-09-1967         Male    157.3    51.4   145.6   0

I want the output summarized by disease status, like this:

Disease  Weight1Mean  Weight1SD  BMI1Mean  BMI1SD  Waist1Mean  Waist1SD
  0        135          30.7       46.3      7.14    125.7       28.1
  1        109          27.1       41.3      8.06    129         29.7

Answer 1

library(dplyr)

your_df %>%
  group_by(Disease) %>%
  summarize(Weight1Mean = mean(Weight1),
            Weight1SD = sd(Weight1)
            # Repeat for the rest of the variables to summarize
  )

You can also use summarize_at instead of summarize:

#... %>%
summarize_at(vars(Weight1, BMI1, Waist1), list(Mean = mean, SD = sd))

Or summarize_if:

#... %>%
summarize_if(is.numeric, list(Mean = mean, SD = sd))

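If you want to exclude some numeric variables from the summary (Patient_id, for example), you can recode them as factors or drop them with select.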
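In dplyr 1.0.0 and later, summarize_at and summarize_if are superseded by across(). The following is a minimal sketch of the same idea, not part of the original answer. It assumes the four sample rows from the question, with the columns spelled BMI1 and Waist1, and it excludes Patient_id directly in the selection instead of recoding it.

```r
library(dplyr)

# Sample rows copied from the question.
# Assumption: the columns are named BMI1 and Waist1.
your_df <- data.frame(
  Patient_id = c(204065, 200214, 191633, 186156),
  Sex        = c("Female", "Female", "Male", "Male"),
  Weight1    = c(113.8, 90, 128.4, 157.3),
  BMI1       = c(41.3, 35.6, 47, 51.4),
  Waist1     = c(105.8, 108, 150, 145.6),
  Disease    = c(0, 1, 1, 0)
)

result <- your_df %>%
  group_by(Disease) %>%
  summarise(
    across(
      where(is.numeric) & !Patient_id,  # every numeric column except the ID
      list(Mean = mean, SD = sd),       # functions to apply to each column
      .names = "{.col}{.fn}"            # name pattern, e.g. Weight1Mean
    )
  )

print(result)
```

The .names pattern produces column names without an underscore (Weight1Mean, Weight1SD, and so on), which matches the layout requested in the question. The grouping column Disease is left out of the selection automatically.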

Answer 2

We can use data.table:

 library(data.table)
 setDT(df1)[, .(Weight1Mean = mean(Weight1), Weight1SD = sd(Weight1)), Disease]
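With 100 variables, writing each mean and sd by hand is impractical. One possible extension, sketched here and not taken from the original answer, is to use .SDcols to apply both functions to a vector of column names. The sample data and the BMI1/Waist1 spellings are assumptions based on the question.

```r
library(data.table)

# Sample rows copied from the question.
# Assumption: the columns are named BMI1 and Waist1.
df1 <- data.frame(
  Patient_id = c(204065, 200214, 191633, 186156),
  Weight1    = c(113.8, 90, 128.4, 157.3),
  BMI1       = c(41.3, 35.6, 47, 51.4),
  Waist1     = c(105.8, 108, 150, 145.6),
  Disease    = c(0, 1, 1, 0)
)

# Columns to summarize; extend this vector to cover all measurements.
cols <- c("Weight1", "BMI1", "Waist1")

res <- setDT(df1)[, c(
  setNames(lapply(.SD, mean), paste0(cols, "Mean")),  # one mean per column
  setNames(lapply(.SD, sd),   paste0(cols, "SD"))     # one SD per column
), by = Disease, .SDcols = cols]

print(res)
```

In this sketch all the mean columns come first, followed by all the SD columns; setcolorder() can rearrange them if the interleaved layout from the question is preferred.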

Mean and standard deviation by a categorical variable in a data frame
