I have a raw Stata dta data file that contains a string vector with … characters. After importing it into R with the foreign package, my data looks like this:
# dput(dat[1:3, 218])
# c("", "I want very much\xc9will do whatever it takes", "I want very much\xc9will do my fair share")
For this example, I'll create an object called test:
test <- c("", "I want very much\xc9will do whatever it takes", "I want very much\xc9will do my fair share")
I want to convert test to a factor, but every value comes back as NA. I tried two approaches:
factor(test,
levels=c("I want very much\\xc9will do whatever it takes",
"I want very much\\xc9will do my fair share"),
labels=c(1, 2))
# [1] <NA> <NA> <NA>
# Levels: 1 2
factor(test,
levels=c("I want very much…will do whatever it takes",
"I want very much…will do my fair share"),
labels=c(1, 2))
# [1] <NA> <NA> <NA>
# Levels: 1 2
I know I could edit the dta file, but I don't want to touch the raw data. What else can I try?
In the end, I want the following:
#[1] <NA> 1 2
#Levels: 1 2
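Answer 0 (score: 1)
Don't use \\ to escape your special character: "\\xc9" is a literal backslash followed by xc9, not the single byte \xc9 that is in your data. This works: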
factor(test,
levels=c("I want very much\xc9will do whatever it takes",
"I want very much\xc9will do my fair share"),
labels=c(1, 2))
#[1] <NA> 1 2
#Levels: 1 2
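To see why the doubled backslash never matches, it can help to look at the raw bytes. The following is a minimal sketch; the object names one_byte and four_char are made up for illustration and are not part of the original data.

```r
# Hypothetical one-element examples, not taken from the real data set
one_byte  <- "\xc9"   # hex escape: a single byte with value 0xC9
four_char <- "\\xc9"  # escaped backslash: the four characters \ x c 9

charToRaw(one_byte)    # c9
charToRaw(four_char)   # 5c 78 63 39

nchar(one_byte,  type = "bytes")   # 1
nchar(four_char, type = "bytes")   # 4

identical(one_byte, four_char)     # FALSE
```

Because factor() matches levels against the data exactly, a level written with \\xc9 cannot equal a value that contains the single byte \xc9.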
Answer 1 (score: 0)
test <- c(NA, "I want very much\xc9will do my fair share", "I want very much\xc9will do whatever it takes")
ana <- as.factor(test)
library(plyr)
bob <- revalue(ana, c("I want very much\xc9will do my fair share" = "1",
"I want very much\xc9will do whatever it takes" = "2"))
bob
Does this work for you?
Answer 2 (score: 0)
Judging from your expected output, perhaps:
factor(as.vector(setNames(1:2,unique(test[test!='']))[test]))
#[1] <NA> 1 2
#Levels: 1 2
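The one-liner above packs several steps together. Unrolled, it might look like the sketch below; the intermediate names lv, lookup, codes and result are introduced here only for illustration.

```r
test <- c("", "I want very much\xc9will do whatever it takes",
          "I want very much\xc9will do my fair share")

# 1. Distinct non-empty strings, in order of first appearance
lv <- unique(test[test != ""])

# 2. Named lookup table: the names are the strings, the values are 1 and 2
lookup <- setNames(seq_along(lv), lv)

# 3. Index the table by name; "" matches no name and yields NA.
#    as.vector() drops the names so only the integer codes remain.
codes <- as.vector(lookup[test])

# 4. Turn the integer codes into a factor with levels "1" and "2"
result <- factor(codes)
result
```

Since the levels are taken from the data itself, this approach avoids retyping the strings and so sidesteps the escaping problem entirely.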
As @thelatemail's answer points out, your levels did not match the strings in test. For example:
test1 <- c("", "I want very much\\xc9will do whatever it takes", "I want very much\\xc9will do my fair share") #using `\\`
factor(test1, levels= unique(test1[test1!='']), labels=1:2)
#[1] <NA> 1 2
#Levels: 1 2
If you take the levels from test instead, they no longer match the strings in test1:
factor(test1, levels= unique(test[test!='']), labels=1:2)
#[1] <NA> <NA> <NA>
#Levels: 1 2
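A quick way to catch this kind of mismatch before calling factor() is to check the candidate levels against the data with %in%. This is only a sketch; bad_levels and good_levels are hypothetical names used for the two spellings discussed in this thread.

```r
test <- c("", "I want very much\xc9will do whatever it takes",
          "I want very much\xc9will do my fair share")

# Levels typed with a doubled backslash (the attempt from the question)
bad_levels <- c("I want very much\\xc9will do whatever it takes",
                "I want very much\\xc9will do my fair share")

# Levels typed with the hex escape, matching the bytes in the data
good_levels <- c("I want very much\xc9will do whatever it takes",
                 "I want very much\xc9will do my fair share")

bad_levels %in% test    # FALSE FALSE: factor() would return all NA
good_levels %in% test   # TRUE TRUE: safe to use as levels

factor(test, levels = good_levels, labels = 1:2)
```

If any entry comes back FALSE, the corresponding level will not be found and those values will end up as NA in the factor.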