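I am analysing the CHFLS data set in R, which ships with the HSAUR2 package. I want to fit a linear model to find out how the other variables affect R_happy; R_happy has been coded so that 1 means "Very happy" and 0 otherwise. How can I code the remaining variables, for example R_region, as numbers so that I can use dummy variables and fit the linear model? I tried as.numeric, but it did not work. My code is below: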
library("HSAUR2") # Load the necessary library
data(CHFLS,package="HSAUR2") #Load the Chinese Health and Family Life Survey data
View(CHFLS) # Browse the data in a spreadsheet-style viewer
help("CHFLS") # Read details about the data, including the covariates
summary(CHFLS) #Produce a summary of the data
# Pie chart showing women's self-reported happiness
slices <- c(280, 1254)
lbls <- c("Very happy (18.25%)", "Otherwise (81.75%)")
pie(slices, labels=lbls)
#Define the variable of interest to be y which is 1 when
#"Very happy" (or greater) and 0 otherwise
y<-(CHFLS$R_happy>="Very happy")
# Append y onto the data and call the new data CHFLSnew
CHFLSnew<-cbind(CHFLS,y)
# Ensure that the new variable y is coded as a factor.
CHFLSnew$y<-as.factor(CHFLSnew$y)
Answer 0 (score: 0)
In general, if you want to convert a factor to numeric:
f <- factor(1:10)
f
#> [1] 1 2 3 4 5 6 7 8 9 10
#> Levels: 1 2 3 4 5 6 7 8 9 10
n <- as.numeric(levels(f)[f])
n
#> [1] 1 2 3 4 5 6 7 8 9 10
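Note that the levels trick above only helps when the factor labels are themselves numbers. For a factor with text labels, such as a region, a manual conversion is usually unnecessary, because R builds the dummy variables itself when a factor appears in a model formula. The following is a minimal sketch of that idea, not a worked analysis of CHFLS: the data frame `toy`, its columns `region`, `income` and `y`, and all values in it are made up for illustration. With a 0/1 response, a logistic regression via `glm()` is one common choice, but `lm()` accepts the same formula.

```r
# Hypothetical toy data standing in for CHFLS (made-up values, not the survey).
toy <- data.frame(
  region = factor(rep(c("East", "North", "South"), each = 20)),
  income = seq(100, by = 30, length.out = 60),
  y      = rep(c(1, 0, 0, 1, 0), times = 12)
)

# as.numeric() on a factor returns the internal integer codes, which impose
# an arbitrary ordering and spacing on an unordered category.
codes <- as.numeric(toy$region)
unique(codes)

# Converting via the labels fails for text labels: every value becomes NA.
lab_as_num <- suppressWarnings(as.numeric(levels(toy$region)[toy$region]))

# model.matrix() shows the dummy (treatment-contrast) columns that R creates
# for a factor: one 0/1 column per level except the reference level.
X <- model.matrix(~ region + income, data = toy)
colnames(X)

# The factor can therefore go straight into the formula.
fit <- glm(y ~ region + income, data = toy, family = binomial)
coef(fit)
```

Each `region...` coefficient is then read relative to the reference level, which by default is the first level of the factor; `relevel()` can change it.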