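I have an R data frame in which one column is a factor whose levels have an implicit ordering. How can I convert the factor levels to specific integers in the following way:
"Strongly disagree" = 1
"Somewhat disagree" = 2
"Neutral" = 3
"Somewhat agree" = 4
"Strongly agree" = 5
For example, here is my data frame: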
agree <- c("Strongly agree", "Somewhat disagree", "Somewhat agree",
"Neutral", "Strongly agree", "Strongly disagree", "Neutral")
age <- c(41, 35, 29, 42, 31, 22, 58)
df <- data.frame(age, agree, stringsAsFactors = TRUE)  # needed in R >= 4.0 for agree to be stored as a factor
df
# age agree
# 1 41 Strongly agree
# 2 35 Somewhat disagree
# 3 29 Somewhat agree
# 4 42 Neutral
# 5 31 Strongly agree
# 6 22 Strongly disagree
# 7 58 Neutral
str(df)
# 'data.frame': 7 obs. of 2 variables:
# $ age : num 41 35 29 42 31 22 58
# $ agree: Factor w/ 5 levels "Neutral","Somewhat agree",..: 4 3 2 1 4 5 1
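Now I would like to convert the agree column to an integer column using the mapping shown above.
I have searched for other questions about converting factors to integers, but none of them deal with preserving the ordering of the factor: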
"How to convert a factor to an integer\numeric without a loss of information?"
"Convert factor to integer"
"Convert factor to integer in a data frame"
Answer 0 (score: 6)
You first need to define the order of the factor levels:
ordering <- c("Strongly disagree", "Somewhat disagree", "Neutral", "Somewhat agree", "Strongly agree")
Then use that definition when you first create the factor:
agreeFactor <- factor(agree, levels = ordering)
After that, converting the factor gives the codes in the intended order:
as.numeric(agreeFactor)
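You could also apply the ordering only at the point where you call as.numeric(), but that leads to inconsistencies if you retrieve the numeric vector again later and forget to pass the "levels =" argument.
Edit: if you want to put the numbers directly into the data frame, just use: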
df$agree <- as.numeric(factor(df$agree, levels = ordering))
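To make the difference concrete, here is a small end-to-end sketch using the data from the question. The column name agree_num and the variable default_codes are made up for illustration, and stringsAsFactors = TRUE is assumed so that agree starts out as a factor, as in the str() output of the question.

```r
agree <- c("Strongly agree", "Somewhat disagree", "Somewhat agree",
           "Neutral", "Strongly agree", "Strongly disagree", "Neutral")
age <- c(41, 35, 29, 42, 31, 22, 58)
df <- data.frame(age, agree, stringsAsFactors = TRUE)

ordering <- c("Strongly disagree", "Somewhat disagree", "Neutral",
              "Somewhat agree", "Strongly agree")

# Default levels are alphabetical, so the codes ignore the intended order
default_codes <- as.numeric(df$agree)
default_codes
# 4 3 2 1 4 5 1

# Re-creating the factor with explicit levels gives the desired mapping
# (agree_num is a hypothetical column name)
df$agree_num <- as.numeric(factor(df$agree, levels = ordering))
df$agree_num
# 5 2 4 3 5 1 3
```

The first result matches the codes shown by str(df) in the question; only the second one follows the 1 to 5 scale that was asked for.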
Answer 1 (score: 1)
The plyr library has a useful revalue function for this kind of operation:
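library(plyr)
# revalue() only relabels the levels, so go through as.character() to make
# as.numeric() read the new labels instead of the internal level codes
df$agree <- as.numeric(as.character(revalue(df$agree, c("Strongly disagree" = "1",
                                                        "Somewhat disagree" = "2",
                                                        "Neutral" = "3",
                                                        "Somewhat agree" = "4",
                                                        "Strongly agree" = "5"))))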
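Overall, @tluh's approach of ordering the factor is the better one, because it keeps the original input and sets the factor levels in the correct order.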
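If you would rather not depend on plyr, the same explicit mapping can be sketched in base R with a named vector used as a lookup table. This is a different technique from revalue, shown only as an alternative; the names mapping and codes are made up for the example.

```r
agree <- c("Strongly agree", "Somewhat disagree", "Somewhat agree",
           "Neutral", "Strongly agree", "Strongly disagree", "Neutral")

# Hypothetical lookup table: names are the labels, values are the target integers
mapping <- c("Strongly disagree" = 1L,
             "Somewhat disagree" = 2L,
             "Neutral"           = 3L,
             "Somewhat agree"    = 4L,
             "Strongly agree"    = 5L)

# as.character() matters when agree is a factor: indexing with a factor
# would use its internal codes instead of its labels
codes <- unname(mapping[as.character(agree)])
codes
# 5 2 4 3 5 1 3
```

Any label that is missing from the lookup table comes back as NA, which makes unexpected values easy to spot.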
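Answer 2 (score: 0)
If the levels of your factor are already in the right order, you can use the following function to convert the factor to its numeric order.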
Convert_Numeric <- function(X) {
  L <- levels(X)
  # relabel the levels as 1..n in their current order, then read off the codes
  Y <- as.numeric(factor(X, labels = seq_along(L)))
  return(Y)
}
This is especially useful inside functions or with dplyr:
library(dplyr)
df %>%
  mutate(Numeric_version = Convert_Numeric(agree))
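The precondition "already ordered by level" matters here. The sketch below, which uses only base R and the data from the question, calls the function on a factor with default (alphabetical) levels and on one created with explicit levels. The names alphabetical and in_order are made up for illustration.

```r
Convert_Numeric <- function(X) {
  L <- levels(X)
  as.numeric(factor(X, labels = seq_along(L)))
}

agree <- c("Strongly agree", "Somewhat disagree", "Somewhat agree",
           "Neutral", "Strongly agree", "Strongly disagree", "Neutral")
ordering <- c("Strongly disagree", "Somewhat disagree", "Neutral",
              "Somewhat agree", "Strongly agree")

alphabetical <- factor(agree)                    # levels sorted alphabetically
in_order     <- factor(agree, levels = ordering) # levels in the intended order

Convert_Numeric(alphabetical)
# 4 3 2 1 4 5 1
Convert_Numeric(in_order)
# 5 2 4 3 5 1 3
```

So the function returns the position of each value in whatever level order the factor currently has. The levels still need to be set as in the first answer before it gives the 1 to 5 scale from the question.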