Question

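I have a dataset that is basically PHQ-9 questionnaire responses. It has 9 columns, each a factor with the levels "Not at all", "Sometimes", "Several days", "More than half the days", and "Nearly everyday". These are scored 0, 1, 1, 2, and 3 respectively. In the end, the responses to all 9 questions add up to a PHQ score out of 27.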

However, in my dataset the responses to these questions are stored as:

$ Interest : Factor w/ 5 levels "More than half the days",..: 1 4 2 2 4 5 4 4 4 5 ...

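What I really want is another column next to each of the factor columns above, containing the corresponding score. In the end, I also want to use these scores to compute a total that gives the depression score.

This is the output I am looking for: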

Interest    I_Factor Pleasure        P_factor  Score 
Not at all    0      Nearly Everyday  2          2

Answer 1

Creating a mock data frame for you:

df <- data.frame(id = c("001", "002", "003", "004", "005"),
             PHQ_1 = c("Not at all", "Not at all", "Sometimes", "Sometimes", "Several Days"),
             PHQ_2 = c("Sometimes", "Sometimes", "Several Days", "More than half the days", "Nearly everyday"))
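
In the question the responses are stored as factors, and the integers in the str() output (1 4 2 2 ...) are internal level codes rather than scores. A minimal base R sketch of the difference, assuming the labels are spelled as in the mock data; `responses`, `interest`, `phq_scores`, and `interest_score` are illustrative names, not from the original post:

```r
# Illustrative only: one mock column stored as a factor, as in the question.
responses <- c("Not at all", "Not at all", "Sometimes", "Sometimes", "Several Days")
interest <- factor(responses)

# Levels are sorted alphabetically by default, so the codes carry no scoring meaning.
levels(interest)
as.integer(interest)

# Assumed score map, taken from the scoring described in the question.
phq_scores <- c("Not at all" = 0,
                "Sometimes" = 1,
                "Several Days" = 1,
                "More than half the days" = 2,
                "Nearly everyday" = 3)

# Look each label up in the named vector to get its score.
interest_score <- unname(phq_scores[as.character(interest)])
interest_score
```

Calling as.numeric() directly on a factor returns the level codes, so the lookup has to go through the labels.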

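Use mutate_at to select your questionnaire items, then apply recode (from dplyr) to all of them at once to convert the Likert-scale responses from factors to numeric values. Giving the function a name (numeric_columns in the example below) puts the results in new columns, so the old columns are not replaced.

Once that is done, use mutate again to compute the row sums and put them in a new column.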

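library(dplyr)  # provides mutate_at(), recode(), select(), contains()

test <- df %>%
  # Recode each item into a new column named <item>_numeric_columns.
  # list(name = ~ ...) replaces the deprecated funs().
  mutate_at(vars(PHQ_1:PHQ_2), list(numeric_columns = ~ recode(.,
                                       "Not at all" = 0,
                                       "Sometimes" = 1,
                                       "Several Days" = 1,
                                       "More than half the days" = 2,
                                       "Nearly everyday" = 3))) %>%
  # Sum the recoded columns row-wise to get the total score.
  mutate(total = rowSums(select(., contains("numeric_columns"))))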

Sample output is below. The original columns are kept, and you get new columns in numeric format plus the total questionnaire score.

   id        PHQ_1                   PHQ_2 PHQ_1_numeric_columns PHQ_2_numeric_columns total
1 001   Not at all               Sometimes                     0                     1     1
2 002   Not at all               Sometimes                     0                     1     1
3 003    Sometimes            Several Days                     1                     1     2
4 004    Sometimes More than half the days                     1                     2     3
5 005 Several Days         Nearly everyday                     1                     3     4
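
For newer dplyr versions (1.0.0 or later), the same idea can be sketched with across() and a named lookup vector. This is an alternative under stated assumptions, not the answer's original method: the `_score` suffix and the names `phq_scores` and `scored` are made up for illustration, and the labels must match the data exactly.

```r
library(dplyr)

# Mock data from the answer, repeated so the sketch is self-contained.
df <- data.frame(
  id = c("001", "002", "003", "004", "005"),
  PHQ_1 = c("Not at all", "Not at all", "Sometimes", "Sometimes", "Several Days"),
  PHQ_2 = c("Sometimes", "Sometimes", "Several Days",
            "More than half the days", "Nearly everyday")
)

# Assumed score map, following the scoring given in the question.
phq_scores <- c("Not at all" = 0,
                "Sometimes" = 1,
                "Several Days" = 1,
                "More than half the days" = 2,
                "Nearly everyday" = 3)

scored <- df %>%
  # Look up each label's score; as.character() makes this work for factors too.
  mutate(across(starts_with("PHQ_"),
                ~ unname(phq_scores[as.character(.x)]),
                .names = "{.col}_score")) %>%
  # Row-wise total over the newly created score columns.
  mutate(total = rowSums(across(ends_with("_score"))))

scored
```

To place each score next to its item, as the question asks, something like relocate(PHQ_1_score, .after = PHQ_1) could follow. A label that is missing from the lookup vector yields NA, which makes misspelled levels easy to spot.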
