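Unable to change column entries from grades (strings) to numeric values (GPA)

Asked: 2017-04-08 21:39:55

Tags: r dplyr plyr

I have a CSV file with a 'Grade' column whose entries range from 'F' and 'D+' up to 'A' (there is no 'A+'). I want to convert these values to numbers, for example 'A' to 4.0 and 'A-' to 3.7.

So far I have tried revalue() from the plyr library, but it did not work.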

    filtered_data$Grade <-
        as.numeric(as.character(revalue(filtered_data$Grade,
                                        c("A" = "4.0", "A-" = "3.7",
                                          "B+" = "3.3", "B" = "3.0",
                                          "B-" = "2.7", "C+" = "2.3",
                                          "C" = "2.0", "C-" = "1.7",
                                          "D+" = "1.3", "D" = "1.0",
                                          "F" = "0.0"))))

    Error in revalue(filtered_data$Grade, c(A = "4.0", `A-` = "3.7", `B+` = "3.3",  :
      x is not a factor or a character vector.

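I also tried the as.numeric(as.character(foo)) trick, but that did not work either.

Third, the "hard-coded" approach failed as well: I tried a for loop to change each entry in the column, but got this message: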

    Warning message:
    In `[<-.factor`(`*tmp*`, i, value = c(11L, 16L, 5L, 13L, 8L, 16L,  :
     invalid factor level, NA generated

Thanks in advance!

2 answers:

Answer 0 (score: 1)

Converting your grade column to a factor makes your first approach work:

library(plyr)  # provides revalue()

filtered_data <- data.frame(Grade = c("A", "B+", "C", "A-", "D", "B", "B-", "C+", "C-", "D+", "F"))
filtered_data$Grade <- as.factor(filtered_data$Grade)

# Grade is still a factor afterwards; its levels are now "4.0", "3.7", ...
filtered_data$Grade <- revalue(filtered_data$Grade,
   c("A" = "4.0", "A-" = "3.7",
     "B+" = "3.3", "B" = "3.0",
     "B-" = "2.7", "C+" = "2.3",
     "C" = "2.0", "C-" = "1.7",
     "D+" = "1.3", "D" = "1.0",
     "F" = "0.0"))
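
The relabelled column is still a factor, so one more step is needed to get the numeric GPA the question asks for. The sketch below is only an illustration in base R: the factor `relabelled` is a hypothetical stand-in for what revalue() returns, and the names `gpa` and `codes` are made up for this example.

```r
# Hypothetical stand-in for a revalue() result: a factor whose labels look like numbers
relabelled <- factor(c("4.0", "3.3", "2.0", "3.7", "0.0"))

# Go through character first, so the labels themselves are parsed as numbers
gpa <- as.numeric(as.character(relabelled))

# Calling as.numeric() directly on a factor returns the internal integer codes instead
codes <- as.numeric(relabelled)

print(gpa)
print(codes)
```

The second call is the usual trap: it runs without any warning, but the numbers it returns are level positions, not grades.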

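Answer 1 (score: 1)

I'm not sure where your error occurs, but I think a lookup vector is much simpler than bringing in a new package and new functions: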

> trans.vec=  c("A"="4.0", "A-"="3.7",
+    "B+" = "3.3", "B" = "3.0",
+    "B-" = "2.7", "C+" = "2.3",
+    "C" = "2.0", "C-" = "1.7",
+    "D+" = "1.3", "D" = "1.0",
+    "F" = "0.0")

That creates a named vector. You can then map the values of the Grade column through it by using them as an index into that vector:

> filtered_data$num.char <- unname(trans.vec[as.character(filtered_data$Grade)])
> filtered_data
   Grade num.char
1      A      4.0
2     B+      3.3
3      C      2.0
4     A-      3.7
5      D      1.0
6      B      3.0
7     B-      2.7
8     C+      2.3
9     C-      1.7
10    D+      1.3
11     F      0.0
> str(filtered_data)
'data.frame':   11 obs. of  2 variables:
 $ Grade   : Factor w/ 11 levels "A","A-","B","B-",..: 1 5 6 2 9 3 4 8 7 10 ...
 $ num.char: chr  "4.0" "3.3" "2.0" "3.7" ...

The values of the vector don't need to be character. You can skip all the as.numeric(as.character()) rigmarole by using a named numeric vector instead:

> trans.vec.num=  c("A"=4.0, "A-"=3.7,
+    "B+" = 3.3, "B" = 3.0,
+    "B-" = 2.7, "C+" = 2.3,
+    "C" = 2.0, "C-" = 1.7,
+    "D+" = 1.3, "D" = 1.0,
+    "F" = 0.0)
> filtered_data$num.num <- unname(trans.vec.num[as.character(filtered_data$Grade)])
> str(filtered_data)
'data.frame':   11 obs. of  3 variables:
 $ Grade   : Factor w/ 11 levels "A","A-","B","B-",..: 1 5 6 2 9 3 4 8 7 10 ...
 $ num.char: chr  "4.0" "3.3" "2.0" "3.7" ...
 $ num.num : num  4 3.3 2 3.7 1 3 2.7 2.3 1.7 1.3 ...

Note that the original Grade column is a factor. The "[" function treats a factor index as its integer codes rather than its labels, so the lookups above go through as.character() to match on the grade names; unname() just drops the names carried over from the lookup vector.
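
One caveat worth checking when using this lookup approach: "[" indexes by a factor's integer codes, not by its labels. The following is a small sketch of the difference; the names `lookup`, `grades`, `by_code` and `by_label` are hypothetical, and the factor levels are set explicitly so the result does not depend on locale sort order.

```r
# Named numeric lookup vector (positions: 1 = "A", 2 = "B+", 3 = "B")
lookup <- c("A" = 4.0, "B+" = 3.3, "B" = 3.0)

# Levels fixed explicitly: codes are A = 1, B = 2, B+ = 3
grades <- factor(c("B+", "A", "B"), levels = c("A", "B", "B+"))

# Indexing with the factor uses the codes 3, 1, 2, i.e. positions in `lookup`
by_code <- lookup[grades]

# Indexing with the labels matches on the names of `lookup`
by_label <- lookup[as.character(grades)]

print(by_code)   # names do not line up with `grades`
print(by_label)  # names line up with `grades`
```

Here "B+" has code 3, which picks the third element of `lookup` (the value for "B"), so only the as.character() version returns the intended grades.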