Using R, my data frame capstone3 has a column Certificate...HQA with the following levels:
levels(capstone3$Certificate...HQA)
[1] "CUM LAUDE" "DIPLOM"
[3] "DOCTORATE" "GRADUATE DIPLOMA"
[5] "HIGHEST HONS" "HONOURS (DISTINCTION)"
[7] "HONOURS (HIGHEST DISTINCTION)" "HONS"
[9] "HONS I" "HONS II"
[11] "HONS II LOWER" "HONS II UPPER"
[13] "HONS III" "HONS UNCLASSIFIED"
[15] "HONS WITH MERIT" "MAGNA CUM LAUDE"
[17] "MASTER'S DEGREE" "OTHER HONS"
[19] "PASS DEGREE" "PASS WITH CREDIT"
[21] "PASS WITH DISTINCTION" "PASS WITH HIGH MERIT"
[23] "PASS WITH MERIT" "SUMMA CUM LAUDE"
I wrote some code to reduce the number of levels by replacing level [7] with [9], level [6] with [12], and so on:
capstone3$Certificate...HQA <- as.factor(capstone3$Certificate...HQA)
capstone3$Certificate...HQA <- gsub("HONOURS (HIGHEST DISTINCTION)","HONS I", capstone3$Certificate...HQA)
capstone3$Certificate...HQA <- gsub("HONOURS (DISTINCTION)","HONS II UPPER", capstone3$Certificate...HQA)
capstone3$Certificate...HQA <- gsub("HONS WITH MERIT","HONS II LOWER", capstone3$Certificate...HQA)
But the gsub code above does not replace the names in the column. Can someone point out what is wrong with my code?
Answer 0 (score: 2)
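Parentheses () are special characters in regular expressions, where they create groups. To match literal parentheses, you need to escape them with \\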
gsub("HONOURS \\(HIGHEST DISTINCTION\\)","HONS I", capstone3$Certificate...HQA)
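To see why the unescaped pattern silently matches nothing, here is a minimal sketch on a toy vector (x is a hypothetical stand-in for the real column, not part of the question's data):

```r
# Toy stand-in for the column values (hypothetical data)
x <- c("HONOURS (HIGHEST DISTINCTION)", "HONS I")

# Unescaped: ( ) delimit a group, so the regex looks for the text
# "HONOURS HIGHEST DISTINCTION" with no parentheses in it
unescaped <- grepl("HONOURS (HIGHEST DISTINCTION)", x)

# Escaped: \\( and \\) match the literal parenthesis characters
escaped <- grepl("HONOURS \\(HIGHEST DISTINCTION\\)", x)

print(unescaped)
print(escaped)
```

Since the unescaped pattern finds no match, gsub returns its input unchanged and raises no error or warning.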
Alternatively, as @ManuelBickel suggested, use fixed = TRUE, so that the pattern is treated as a plain string and matched as is.
gsub("HONOURS (HIGHEST DISTINCTION)","HONS I", capstone3$Certificate...HQA, fixed = TRUE)
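Putting it together, the three replacements from the question could be sketched as follows. The vector cert and the lookup vector recode are hypothetical names used only for illustration. Note that gsub returns a character vector, so the result is wrapped in factor() again to get the reduced set of levels.

```r
# Toy stand-in for capstone3$Certificate...HQA (hypothetical data)
cert <- factor(c("HONOURS (HIGHEST DISTINCTION)", "HONOURS (DISTINCTION)",
                 "HONS WITH MERIT", "HONS I", "PASS DEGREE"))

# Hypothetical lookup: old label -> new label
recode <- c("HONOURS (HIGHEST DISTINCTION)" = "HONS I",
            "HONOURS (DISTINCTION)"         = "HONS II UPPER",
            "HONS WITH MERIT"               = "HONS II LOWER")

out <- as.character(cert)
for (old in names(recode)) {
  # fixed = TRUE: the pattern is matched literally, parentheses included
  out <- gsub(old, recode[[old]], out, fixed = TRUE)
}

# gsub returned character data; convert back to a factor
cert2 <- factor(out)
print(levels(cert2))
```

If the column is already a factor, assigning to levels() directly (for example levels(f)[levels(f) == "HONS WITH MERIT"] <- "HONS II LOWER") is another base R option that merges levels without any pattern matching.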