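I have a data frame with columns that contain the levels "Excellent, Very Good, Good, Fair, Poor". I want to average these values and use them in other ways, assigning 5 to "Excellent", 4 to "Very Good", and so on.
My attempts so far have failed because the default numeric assignment appears to follow alphabetical order, so "Excellent" becomes 1, "Fair" becomes 2, and so on.
Thanks for your help.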
Answer 0: (score: 2)
I use a named vector as a lookup table:
options = c('Excellent' = 5, 'Very Good' = 4, 'Good' = 3, 'Fair' = 2, 'Poor' = 1)
df = data.frame(grade = sample(names(options), 100, replace = TRUE))
head(df)
      grade
1 Very Good
2      Good
3 Excellent
4 Very Good
5      Fair
6      Good
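df = within(df, {
  # Convert to character so the lookup matches by name;
  # a factor index would be treated as its integer codes instead.
  grade_numeric = options[as.character(grade)]
})
head(df)
      grade grade_numeric
1 Very Good             4
2      Good             3
3 Excellent             5
4 Very Good             4
5      Fair             2
6      Good             3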
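Whether a named-vector lookup matches by name depends on the type of the index: a character vector matches names, while a factor is treated as its integer codes, which follow the alphabetical level order. Here is a minimal sketch of the difference, using the hypothetical names `lookup` and `g` (not from the answer above):

```r
# Hypothetical lookup table and sample grades, for illustration only
lookup <- c('Excellent' = 5, 'Very Good' = 4, 'Good' = 3, 'Fair' = 2, 'Poor' = 1)
g <- factor(c('Very Good', 'Good', 'Excellent', 'Fair'))

# Indexing with the factor itself uses its integer codes
# (levels sort alphabetically: Excellent, Fair, Good, Very Good)
by_code <- unname(lookup[g])

# Converting to character first looks each grade up by name
by_name <- unname(lookup[as.character(g)])

# The name-based scores can then be averaged
mean(by_name)
```

If the column is already a character vector (the default for `data.frame()` since R 4.0.0), the conversion is a no-op, so it is safe to apply either way.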
Answer 1: (score: 2)
Do you need this as an ordered factor? If so, factor is probably your best option.
Sample data
column <- c("Excellent", "Very Good", "Good", "Fair", "Poor",
"Good", "Fair", "Poor")
col.f <- factor(column,
levels = c("Poor","Fair" , "Good" , "Very Good", "Excellent"),
labels = c("Poor","Fair" , "Good" , "Very Good", "Excellent"),
ordered = TRUE)
col.f
[1] Excellent Very Good Good Fair Poor Good Fair Poor
Levels: Poor < Fair < Good < Very Good < Excellent
You can then call as.numeric(col.f) to get the numeric values.
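As a rough sketch of how this gives the scoring asked for in the question: with the levels listed from "Poor" to "Excellent", the integer codes run from 1 to 5 in that order. The variable name `scores` is an assumption made here for illustration.

```r
column <- c("Excellent", "Very Good", "Good", "Fair", "Poor",
            "Good", "Fair", "Poor")

# Levels are given from worst to best, so the codes are Poor = 1 ... Excellent = 5
col.f <- factor(column,
                levels = c("Poor", "Fair", "Good", "Very Good", "Excellent"),
                ordered = TRUE)

# as.numeric() returns the integer code of each element
scores <- as.numeric(col.f)

# Average the codes; calling mean() on the factor itself would return NA with a warning
mean(scores)
```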