I have a very simple question. Suppose I have given marks to 100 students like this:
set.seed(1234)
Marks <- rnorm(100, 55, 10)
z <- runif(100)
Gender <- ifelse(z < 0.5, "M", "F")
#Creating Data frame
Df <- data.frame(SNo = 1:100, Marks, Gender)
head(Df)
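Now I need to assign grades to the students, but the grading criteria differ for male and female students. The grading criteria are:
I managed to solve this, but I don't find my approach elegant. This is what I tried: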
#1 Method
Grade = ifelse(Df$Gender == "M", cut(Df$Marks, breaks = c(0, 35, 45, 55, 101), labels = FALSE),
cut(Df$Marks, breaks = c(0, 40, 50, 60, 101), labels = FALSE))
Grade <- as.character(factor(Grade, labels = LETTERS[4:1]))
#2. Method
Gradef <- function(x, cp = c(35, 45, 55)) {
ifelse(x < cp[1], "D", ifelse(x < cp[2], "C", ifelse(x < cp[3], "B", "A")))
}
Grade2 <- ifelse(Df$Gender == "M", Gradef(Df$Marks), Gradef(Df$Marks, c(40, 50, 60)))
sum(Grade == Grade2) #both method give same grade
Df$Grade <- Grade
Can anyone suggest a better way to solve the same problem? I don't want to use any external packages in R.
Thanks.
Answer 0 (score: 1)
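# Cut points per gender and the grade labels, lowest to highest
mylist = list(F = c(35, 45, 55), M = c(40, 50, 60))
grades = c("D", "C", "B", "A")
# findInterval() counts how many cut points each mark has reached (0 to 3);
# as.character() makes the lookup go by name even if Gender is a factor
Df$Grade = grades[1 + sapply(1:NROW(Df), function(i)
findInterval(Df$Marks[i], mylist[[as.character(Df$Gender[i])]]))]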
head(Df, 10)
# SNo Marks Gender Grade
#1 1 42.92934 F C
#2 2 57.77429 F A
#3 3 65.84441 M A
#4 4 31.54302 F D
#5 5 59.29125 F A
#6 6 60.06056 F A
#7 7 49.25260 M C
#8 8 49.53368 M C
#9 9 49.35548 M C
#10 10 46.09962 F B
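The same lookup can also be done one gender at a time instead of one row at a time, since findInterval is vectorized over the marks. What follows is only a minimal sketch under the same cut points as above; `grade_by_group` is a hypothetical helper name, and the small `marks` and `gender` vectors are made-up boundary values so the example runs on its own.

```r
# Hypothetical helper: grade marks using group-specific cut points.
grade_by_group <- function(marks, group, cutpoints,
                           labels = c("D", "C", "B", "A")) {
  group <- as.character(group)  # accept factor or character input
  out <- character(length(marks))
  for (g in unique(group)) {
    idx <- group == g
    # findInterval() returns 0..length(cutpoints); +1 gives the label index
    out[idx] <- labels[1 + findInterval(marks[idx], cutpoints[[g]])]
  }
  out
}

# Made-up marks sitting on or just below each gender's cut points
marks  <- c(34.9, 35, 44.9, 55, 39.9, 40, 59.9, 60)
gender <- c("F", "F", "F", "F", "M", "M", "M", "M")
cuts   <- list(F = c(35, 45, 55), M = c(40, 50, 60))

result <- grade_by_group(marks, gender, cuts)
print(result)
```

Because findInterval uses left-closed intervals by default, a mark exactly on a cut point gets the higher grade.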
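Answer 1 (score: 1)
Given that your definition of efficient is fewer lines of code, I think this is what you want. Starting from Method 1, the labels can be passed to cut() directly, so a single statement is enough: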
# Same breaks as Method 1 in the question; as.vector() turns each factor into character
Grade = ifelse(Df$Gender == "M", as.vector(cut(Df$Marks, breaks = c(0, 35, 45, 55, 101), labels = c("D", "C", "B", "A"))),
as.vector(cut(Df$Marks, breaks = c(0, 40, 50, 60, 101), labels = c("D", "C", "B", "A"))))
> head(Grade)
[1] "C" "B" "A" "D" "B" "A"
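So only one line of code is needed.
Note: you can make the code more flexible by replacing each hard-coded piece with a variable, for example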
labs <- c("D", "C", "B", "A")
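and then using the labs variable in the code. That way you only have to change one place at the top, and you can reuse your function for different grading systems.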
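As a rough illustration of that idea, here is a sketch that moves both the labels and the breaks into variables and wraps the ifelse in a function. The name `grade_marks` and the `breaks` list are assumptions made for this example, and the cut points are taken from Method 1 in the question.

```r
# Settings kept in one place at the top (assumed values from Method 1)
labs   <- c("D", "C", "B", "A")
breaks <- list(M = c(0, 35, 45, 55, 101),
               F = c(0, 40, 50, 60, 101))

# Hypothetical wrapper: grade each mark with the breaks for its gender
grade_marks <- function(marks, gender, breaks, labs) {
  ifelse(gender == "M",
         as.character(cut(marks, breaks = breaks$M, labels = labs)),
         as.character(cut(marks, breaks = breaks$F, labels = labs)))
}

example <- grade_marks(c(30, 42, 58), c("M", "F", "M"), breaks, labs)
print(example)
```

Switching to another grading system then only means changing `labs` and `breaks`.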
Code used:
set.seed(1234)
Marks <- rnorm(100, 55, 10)
z <- runif(100)
Gender <- ifelse(z < 0.5, "M", "F")
Df <- data.frame(SNo = 1:100, Marks, Gender)
Answer 2 (score: 1)
Using cut with labels is my trick, much like @hector-haffenden's answer above. This version, however, goes step by step.
set.seed(1234)
#Marks <- rnorm(100, 55, 10)
Marks <- 1:100 #for verification
z <- runif(100)
Gender <- ifelse(z < 0.5, "M", "F")
#Creating Data frame
Df <- data.frame(SNo = 1:100, Marks, Gender)
head(Df)
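# right = FALSE closes the intervals on the left: [0,35), [35,45), ...
# The upper break is 101 so that a mark of exactly 100 does not become NA
cutsF <- cut(Df$Marks, breaks = c(0, 35, 45, 55, 101), labels = c('D', 'C', 'B', 'A'), right = FALSE)
cutsM <- cut(Df$Marks, breaks = c(0, 40, 50, 60, 101), labels = c('D', 'C', 'B', 'A'), right = FALSE)
Df$Grades <- ifelse(Df$Gender == 'F', as.character(cutsF), as.character(cutsM))
# For the sake of verification:
Df$cutsF <- cutsF
Df$cutsM <- cutsM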
head(Df ,20)
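Edit: I have edited the code and replaced include.lowest with right=FALSE. This closes the groups on the left and satisfies the "less than 35" condition. However, this does not work for 55/60. You may need to use 54 and 59 instead.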
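To see what right=FALSE changes at the boundaries, the following small check may help. It is only a sketch: the vector `x`, the upper break of 101 and the variable names are chosen for illustration and use the 35/45/55 cut points.

```r
x    <- c(34, 35, 54, 55)
brk  <- c(0, 35, 45, 55, 101)
labs <- c("D", "C", "B", "A")

# Default right = TRUE: intervals are (0,35], (35,45], (45,55], (55,101]
right_closed <- as.character(cut(x, breaks = brk, labels = labs))

# right = FALSE: intervals are [0,35), [35,45), [45,55), [55,101)
left_closed <- as.character(cut(x, breaks = brk, labels = labs, right = FALSE))

print(data.frame(x, right_closed, left_closed))
```

With the default setting, marks of exactly 35 and 55 fall into the lower grade (D and B). With right=FALSE they fall into the higher grade (C and A), so which setting is appropriate depends on how the grading table treats marks that sit exactly on a cut point.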