head(AustinParents)
# A tibble: 6 x 149
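My Gender variable used to be an integer type, as this preview of the data shows. I turned it into a factor with the factor() call that follows the preview.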
  Respondent Employed StayHome R1_Cost Gender
  <int> <int> <fct> <int>
1          1        2        1       1      1
2          1        2        1       1      2
3          2        2        1       1      1
4          1        2        1       0      1
AustinParents$Gender = factor(AustinParents$Gender, levels = c(1, 2, 3, 4), labels = c("Male", "Female", "Prefer Not to say", "Other"))
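In the rows I still see Female, Male, and so on. Can someone tell me what I did wrong? More importantly, can you tell me how to fix it? My Race variable has the same problem. All the other variables are fine.
This is what I see in the rows and columns of the viewer:
Respondent Employed StayHome R1_Cost Gender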
summary(AustinParents$Gender)
             Male            Female Prefer Not to say             Other              NA's
                0                 0                 0                 0               392
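The all-NA summary above can be reproduced in a small sketch. This is only one possible explanation, under the assumption that the factor() line was run a second time on a column that had already been converted. The vector gender_codes is a hypothetical stand-in for AustinParents$Gender and is not taken from the real data.

```r
# Hypothetical stand-in for AustinParents$Gender (integer codes 1-4).
gender_codes <- c(1L, 2L, 1L, 1L)
gender_labels <- c("Male", "Female", "Prefer Not to say", "Other")

# First conversion: the integer codes match the levels, so labels are applied.
gender_once <- factor(gender_codes, levels = c(1, 2, 3, 4),
                      labels = gender_labels)
summary(gender_once)

# Second conversion on the already-labelled factor: its values are now
# "Male" / "Female", which match none of the levels 1-4, so every entry
# becomes NA.
gender_twice <- factor(gender_once, levels = c(1, 2, 3, 4),
                       labels = gender_labels)
summary(gender_twice)
```

If this is what happened, reloading the original data and running the conversion exactly once should give the labelled factor.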
Answer 0 (score: 1)
When I remove levels, it works fine for me.
Suppose Gender is:
Gender <- c("Male", "Male", "Male", "Female", "Female",
            "Prefer Not to say", "Prefer Not to say",
            "Other", "Other")
Then use it like this:
Gender <- factor(Gender,
labels = c("Male", "Female", "Prefer Not to say", "Other"))
summary(Gender)
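One caveat is worth checking before relying on labels alone. When levels is omitted, factor() takes the sorted unique values as the levels and assigns labels to them by position, so on character data the labels can end up attached to the wrong values. The sketch below uses illustrative names (gender_text, gender_order, relabelled, kept) that are not from the original post. It compares the labels-only call with passing the desired order through levels instead.

```r
gender_text <- c("Male", "Male", "Male", "Female", "Female",
                 "Prefer Not to say", "Prefer Not to say",
                 "Other", "Other")
gender_order <- c("Male", "Female", "Prefer Not to say", "Other")

# labels only: the default levels are the sorted unique values
# ("Female", "Male", "Other", "Prefer Not to say"), and the labels are
# assigned to them by position, so "Male" is relabelled "Female", etc.
relabelled <- factor(gender_text, labels = gender_order)
table(original = gender_text, relabelled = relabelled)

# levels given as the character values themselves: the values keep their
# meaning and only the ordering of the levels changes.
kept <- factor(gender_text, levels = gender_order)
summary(kept)
```

The cross-tabulation makes any mismatch between the original values and the new labels visible, which is a quick check to run after any relabelling.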