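I wrote a function in R that converts a data frame of letter grades into numeric grades. I then use sapply() on each column of the data frame. Is there a simpler way to do this that doesn't need three separate calls to sapply()? Is there a way to apply a function to every element of a data frame rather than to each row or column?
The source data, grades, looks like this: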
grades <- read.table("Grades.txt", header = TRUE)
head(grades)
final_exam quiz_avg homework_avg
1 C A A
2 C- B- A
3 D+ B+ A
4 B+ B+ A
5 F B+ A
6 B A- A
My convert_grades function looks like this:
convert_grades <- function(x) {
  if (x == "A+") {
    x <- 4.3
  } else if (x == "A") {
    x <- 4
  } else if (x == "A-") {
    x <- 3.7
  } else if (x == "B+") {
    x <- 3.3
  } else if (x == "B") {
    x <- 3
  } else if (x == "B-") {
    x <- 2.7
  } else if (x == "C+") {
    x <- 2.3
  } else if (x == "C") {
    x <- 2
  } else if (x == "C-") {
    x <- 1.7
  } else if (x == "D+") {
    x <- 1.3
  } else if (x == "D") {
    x <- 1
  } else if (x == "D-") {
    x <- 0.7
  } else if (x == "F") {
    x <- 0
  } else {
    x <- NA
  }
  return(x)
}
This is how I currently do it:
num_grades <- grades
num_grades[, 1] <- sapply(grades[, 1], convert_grades)
num_grades[, 2] <- sapply(grades[, 2], convert_grades)
num_grades[, 3] <- sapply(grades[, 3], convert_grades)
head(num_grades)
final_exam quiz_avg homework_avg
1 2.0 4.0 4
2 1.7 2.7 4
3 1.3 3.3 4
4 3.3 3.3 4
5 0.0 3.3 4
6 3.0 3.7 4
Answer 0 (score: 13)
I would rewrite your convert_grades function as follows:
convert_grades <- function(x) {
  A <- factor(x, levels=c("A+", "A", "A-",
                          "B+", "B", "B-",
                          "C+", "C", "C-",
                          "D+", "D", "D-", "F"))
  values <- c(4.3, 4, 3.7,
              3.3, 3, 2.7,
              2.3, 2, 1.7,
              1.3, 1, 0.7, 0)
  values[A]
}
Then I would do the conversion like this:
num_grades <- grades
num_grades[] <- lapply(num_grades, convert_grades)
num_grades
final_exam quiz_avg homework_avg
1 2.0 4.0 4
2 1.7 2.7 4
3 1.3 3.3 4
4 3.3 3.3 4
5 0.0 3.3 4
6 3.0 3.7 4
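The empty brackets in `num_grades[] <- lapply(...)` are what keep the result a data frame: lapply() on its own returns a plain list, while assigning into `num_grades[]` replaces the columns in place. A minimal sketch of the difference, using a small stand-in data frame built from the first rows of the question's data (the name `as_list` is only for illustration):

```r
# Stand-in for the question's data; only three rows and two columns.
grades <- data.frame(final_exam = c("C", "C-", "D+"),
                     quiz_avg   = c("A", "B-", "B+"),
                     stringsAsFactors = FALSE)

# Same idea as the function above: a factor's integer codes index `values`.
convert_grades <- function(x) {
  lv <- c("A+", "A", "A-", "B+", "B", "B-",
          "C+", "C", "C-", "D+", "D", "D-", "F")
  values <- c(4.3, 4, 3.7, 3.3, 3, 2.7, 2.3, 2, 1.7, 1.3, 1, 0.7, 0)
  values[factor(x, levels = lv)]
}

# Without the empty brackets the result is a plain list of numeric vectors.
as_list <- lapply(grades, convert_grades)

# With the empty brackets the columns are replaced and the data frame
# structure (class, column names, row names) is kept.
num_grades <- grades
num_grades[] <- lapply(num_grades, convert_grades)

# A grade that is not among the levels becomes NA.
unknown <- convert_grades("Q")
```

Because unknown letters turn into NA rather than raising an error, it may be worth checking `anyNA(num_grades)` after the conversion.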
Answer 1 (score: 11)
First vectorize your function. You could use ifelse, or a named lookup vector:
grvec <- c("A+"=4.3,"A"=4,"A-"=3.7,"B+"=3.3,"B"=3,"B-"=2.7,
"C+"=2.3,"C"=2,"C-"=1.7,"D+"=1.3,"D"=1,"D-"= 0.7,
"F"=0)
grades <- data.frame(final_exam=c("C","C-","D+"),
quiz_avg=c("A","B-","Q"))
## final_exam quiz_avg
## 1 C A
## 2 C- B-
## 3 D+ Q
num_grades <- apply(grades,2,function(x) grvec[as.character(x)])
## final_exam quiz_avg
## [1,] 2.0 4.0
## [2,] 1.7 2.7
## [3,] 1.3 NA
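Note that apply() returns a matrix here. If a data frame is wanted instead, the same named-vector lookup can be combined with lapply() and an empty-bracket assignment. A small sketch under that assumption, reusing the `grvec` and `grades` values from above (the unname() call is an addition that drops the letter names from the looked-up values):

```r
grvec <- c("A+"=4.3,"A"=4,"A-"=3.7,"B+"=3.3,"B"=3,"B-"=2.7,
           "C+"=2.3,"C"=2,"C-"=1.7,"D+"=1.3,"D"=1,"D-"= 0.7,
           "F"=0)
grades <- data.frame(final_exam=c("C","C-","D+"),
                     quiz_avg=c("A","B-","Q"),
                     stringsAsFactors = FALSE)

# Look up every column by name; as.character() also covers factor columns.
# Letters missing from grvec (here "Q") come back as NA.
num_grades <- grades
num_grades[] <- lapply(grades, function(col) unname(grvec[as.character(col)]))
```

The result keeps the column names of `grades`, and each column is a plain numeric vector.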
Answer 2 (score: 1)
Here is a hash-based approach that is quite fast and scales to many more grades:
library(qdap)
grvec <- list("A+"=4.3,"A"=4,"A-"=3.7,"B+"=3.3,"B"=3,"B-"=2.7,
"C+"=2.3,"C"=2,"C-"=1.7,"D+"=1.3,"D"=1,"D-"= 0.7, "F"=0)
# dat is assumed to be the letter-grade data frame (grades in the question)
dat <- grades
dat[] <- lapply(dat, lookup, list2df(grvec)[, c(2:1)])
## final_exam quiz_avg homework_avg
## 1 2.0 4.0 4
## 2 1.7 2.7 4
## 3 1.3 3.3 4
## 4 3.3 3.3 4
## 5 0.0 3.3 4
## 6 3.0 3.7 4
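For readers without qdap, a similar hashed lookup can be sketched in base R with an environment, which R implements as a hash table. This is not how qdap's lookup() works internally; it only illustrates the same idea. The names `grade_env` and `lookup_grade` are hypothetical, and `dat` is a small stand-in for the letter-grade data frame:

```r
grvec <- c("A+"=4.3,"A"=4,"A-"=3.7,"B+"=3.3,"B"=3,"B-"=2.7,
           "C+"=2.3,"C"=2,"C-"=1.7,"D+"=1.3,"D"=1,"D-"= 0.7, "F"=0)

# Store each letter grade as a key in a hashed environment.
grade_env <- list2env(as.list(grvec), envir = new.env(hash = TRUE))

# Fetch the value for every element of x; unknown keys give NA.
lookup_grade <- function(x) {
  hits <- mget(as.character(x), envir = grade_env, ifnotfound = NA_real_)
  unlist(hits, use.names = FALSE)
}

# Stand-in data; "Q" is deliberately not a valid grade.
dat <- data.frame(final_exam = c("C", "C-", "D+"),
                  quiz_avg   = c("A", "B-", "Q"),
                  stringsAsFactors = FALSE)
dat[] <- lapply(dat, lookup_grade)
```

For a table of only thirteen grades the speed difference from a named vector is unlikely to matter; the environment version mainly pays off when the key set is large.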