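I have a CSV file with a 'Grade' column whose entries (grades) range from 'F' and 'D+' up to 'A' (there is no 'A+'). What I want to do is convert these values to numbers, for example 'A' to 4.0 (numeric) and 'A-' to 3.7 (again, numeric).
So far I have tried revalue() from the plyr library, but it did not work.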
filtered_data$Grade <-
as.numeric(as.character(revalue(filtered_data$Grade,
c("A"="4.0", "A-"="3.7",
"B+" = "3.3", "B" = "3.0",
"B-" = "2.7", "C+" = "2.3",
"C" = "2.0", "C-" = "1.7",
"D+" = "1.3", "D" = "1.0",
"F" = "0.0"))))
Error in revalue(filtered_data$Grade, c(A = "4.0", `A-` = "3.7",
`B+` = "3.3", :
x is not a factor or a character vector.
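I also tried a few tricks with as.numeric(as.character(foo)), but that did not work either.
Third, the "hard-coded" approach did not help either: I tried to implement a for loop that changes each entry in the column, but I got this message: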
Warning message:
In `[<-.factor`(`*tmp*`, i, value = c(11L, 16L, 5L, 13L, 8L, 16L, :
invalid factor level, NA generated
Thanks in advance!
Answer 0 (score: 1)
Converting your Grade column to a factor makes your first approach work:
library(plyr)
filtered_data<-data.frame(Grade=c("A","B+", "C", "A-","D","B", "B-","C+","C-","D+","F"))
filtered_data$Grade <- as.factor(filtered_data$Grade)
filtered_data$Grade <- revalue(filtered_data$Grade,
c("A"="4.0", "A-"="3.7",
"B+" = "3.3", "B" = "3.0",
"B-" = "2.7", "C+" = "2.3",
"C" = "2.0", "C-" = "1.7",
"D+" = "1.3", "D" = "1.0",
"F" = "0.0"))
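Note that revalue() returns a factor, so the column above still holds labels such as "4.0" rather than numbers. A minimal base R sketch of the last step follows; the small `grade_factor` vector is only an illustrative stand-in for the revalued column, not data from the question.

```r
# Stand-in for the factor that revalue() returns (illustrative values only).
grade_factor <- factor(c("4.0", "3.3", "2.0", "3.7"))

# as.numeric() on a factor returns its internal integer codes, not the labels.
codes <- as.numeric(grade_factor)

# Converting to character first recovers the numbers written in the labels.
points <- as.numeric(as.character(grade_factor))

print(codes)
print(points)
```

Applied to the data frame, this would be filtered_data$Grade <- as.numeric(as.character(filtered_data$Grade)) after the revalue() call.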
Answer 1 (score: 1)
I'm not sure where your error occurs, but I think a lookup vector is much simpler than pulling in a new package and function:
> trans.vec= c("A"="4.0", "A-"="3.7",
+ "B+" = "3.3", "B" = "3.0",
+ "B-" = "2.7", "C+" = "2.3",
+ "C" = "2.0", "C-" = "1.7",
+ "D+" = "1.3", "D" = "1.0",
+ "F" = "0.0")
That creates a named vector. You can then push the values of the Grade column through the extraction function `[` applied to that vector:
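> filtered_data$num.char <- unname(trans.vec[as.character(filtered_data$Grade)])
> filtered_data
   Grade num.char
1      A      4.0
2     B+      3.3
3      C      2.0
4     A-      3.7
5      D      1.0
6      B      3.0
7     B-      2.7
8     C+      2.3
9     C-      1.7
10    D+      1.3
11     F      0.0
> str(filtered_data)
'data.frame': 11 obs. of 2 variables:
$ Grade : Factor w/ 11 levels "A","A-","B","B-",..: 1 5 6 2 9 3 4 8 7 10 ...
$ num.char: chr "4.0" "3.3" "2.0" "3.7" ...
The values of the vector do not need to be character. If you use a named numeric vector like the one below, you can skip the as.numeric() conversion of the result:
> trans.vec.num= c("A"=4.0, "A-"=3.7,
+ "B+" = 3.3, "B" = 3.0,
+ "B-" = 2.7, "C+" = 2.3,
+ "C" = 2.0, "C-" = 1.7,
+ "D+" = 1.3, "D" = 1.0,
+ "F" = 0.0)
> filtered_data$num.num <- unname(trans.vec.num[as.character(filtered_data$Grade)])
> str(filtered_data)
'data.frame': 11 obs. of 3 variables:
$ Grade : Factor w/ 11 levels "A","A-","B","B-",..: 1 5 6 2 9 3 4 8 7 10 ...
$ num.char: chr "4.0" "3.3" "2.0" "3.7" ...
$ num.num : num 4 3.3 2 3.7 1 3 2.7 2.3 1.7 1.3 ...
Note that the Grade column is a factor, which is why it is wrapped in as.character() before indexing: `[` treats a factor index as its integer codes, so trans.vec[filtered_data$Grade] would pick values by level position rather than by name (row 2, "B+", would come back as 2.7). The unname() call only drops the grade names that the lookup carries over to the result.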
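The lookup can also be wrapped in a small function that works for both factor and character input and reports grades missing from the table. This is only a sketch: `grade_to_points` is a hypothetical helper name, and warning on unknown grades is an assumption about the desired behaviour, not something stated in the question.

```r
# Lookup table from the answer above, as a named numeric vector.
trans.vec.num <- c("A" = 4.0, "A-" = 3.7, "B+" = 3.3, "B" = 3.0,
                   "B-" = 2.7, "C+" = 2.3, "C" = 2.0, "C-" = 1.7,
                   "D+" = 1.3, "D" = 1.0, "F" = 0.0)

# Hypothetical helper: accepts a factor or a character vector of grades.
grade_to_points <- function(grades, lookup = trans.vec.num) {
  keys <- as.character(grades)      # look up by label, never by factor code
  points <- unname(lookup[keys])    # labels missing from the table become NA
  unknown <- unique(keys[is.na(points) & !is.na(keys)])
  if (length(unknown) > 0) {
    warning("No mapping for grade(s): ", paste(unknown, collapse = ", "))
  }
  points
}

grades <- factor(c("A", "B+", "C", "A-"))
result <- grade_to_points(grades)
print(result)
```

With the data frame from above, the call would be filtered_data$num.num <- grade_to_points(filtered_data$Grade).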