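I have a data frame of Likert ratings covering several aspects of a course (about 40 columns of Likert scores, like the two columns in the sample data below).
Not every row contains a valid score. Valid scores are 1:5; invalid scores are coded as 96:99 or are missing altogether.
For each satisfaction column, I want to create a mean score per ID by:
1) filtering out invalid scores,
2) computing the mean of the valid scores for each ID,
3) putting each ID's mean satisfaction score in a new column named [column.name].mean (for example, Skill.satisfaction.mean below).
I have included a sample data frame and the transformation applied to a single column below.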
library(dplyr)
#### sample score vector (mixing numbers with "" makes every element a character string)
possible.scores <- c(1:5, 96, 97, 99, "")
#### data frame
ratings <- data.frame(
  ID = c(rep(1:7, each = 2), 8:10),
  Degree = c(rep("Double", times = 14), rep("Single", times = 3)),
  Skill.satisfaction = sample(possible.scores, size = 17, replace = TRUE),
  Social.satisfaction = sample(possible.scores, size = 17, replace = TRUE)
)
#### transformation applied over one of the satisfaction scales
ratings <- ratings %>%
  group_by(ID) %>%
  filter(!Skill.satisfaction %in% c(96:99), Skill.satisfaction != "") %>%
  # as.character() first, so factor columns convert to their labels rather than their level codes
  mutate(Skill.satisfaction.mean = mean(as.numeric(as.character(Skill.satisfaction)), na.rm = TRUE))
Answer 0 (score: 1)
library(dplyr)
ratings %>%
  group_by(ID) %>%
  # Change satisfaction columns from factor into numeric
  mutate_at(vars(-ID, -Degree), list(~as.numeric(as.character(.)))) %>%
  # Get mean for values in 1:5
  mutate_at(vars(-ID, -Degree), list(mean = ~mean(.[. %in% 1:5], na.rm = TRUE)))
# A tibble: 6 x 6
# Groups:   ID [3]
     ID Degree Skill.satisfaction Social.satisfaction Skill.satisfaction_mean Social.satisfaction_mean
  <int> <fct>               <dbl>               <dbl>                   <dbl>                    <dbl>
1     1 Double                 96                  99                       2                      NaN
2     1 Double                  2                  97                       2                      NaN
3     2 Double                  1                  97                       1                      NaN
4     2 Double                 97                  NA                       1                      NaN
5     3 Double                 96                  96                     NaN                        3
6     3 Double                 99                   3                     NaN                        3
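The same idea can be sketched with across(), assuming dplyr 1.0 or later is available. This is not part of the original answer: ratings_demo and valid_mean are hypothetical names introduced for illustration, and the values are made up so the result is reproducible. The new columns are named with a ".mean" suffix to match the naming asked for in the question, and groups with no valid score get NA instead of NaN.

```r
library(dplyr)

# Hypothetical fixed data (not from the question) so the result is reproducible
ratings_demo <- data.frame(
  ID = c(1, 1, 2, 2, 3),
  Degree = c("Double", "Double", "Double", "Double", "Single"),
  Skill.satisfaction = c("96", "2", "1", "4", ""),
  Social.satisfaction = c("99", "97", "5", "3", "4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of the values in 1:5, or NA when a group has none
valid_mean <- function(x) {
  x <- suppressWarnings(as.numeric(as.character(x)))
  x <- x[x %in% 1:5]  # drops 96:99, "" (converted to NA) and missing values
  if (length(x) == 0) NA_real_ else mean(x)
}

result <- ratings_demo %>%
  group_by(ID) %>%
  # One new column per satisfaction column, named <column>.mean
  mutate(across(ends_with(".satisfaction"), valid_mean, .names = "{.col}.mean")) %>%
  ungroup()
```

Because the invalid scores are dropped inside the helper rather than with filter(), every original row is kept, and with about 40 columns only the ends_with() selector needs to match the real column names.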