sample_n for different sizes in R

Time: 2015-11-27 09:30:02

Tags: r random-sample

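I am trying to use sample_n by age group (Bage), gender and employment to create a new column holding ethnicity. I have found a way to do it, but each sample takes 9 lines of code, and the size changes every time because I allocate a different number of people to each ethnic group.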

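The example below shows the code for randomly allocating unemployed males in the 16-24 age group to the ethnicity the census defines as "Other". The example data is taken from the full data set. After this I would repeat the whole block for every employment type and ethnicity (changing the details: age group, gender, employment, size), so it is a long and slow process. I have looked at writing a loop or a function, but because I need samples of different sizes rather than one sample size across the whole data set, I have not got very far.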

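Any suggestions for reducing the length of the code and the time it takes would be much appreciated.

Example input data: males in age group 16-24 (Bage == 16) with some of the employment types: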

       ID    Ages    Bage  Gender     Employment   Ethnicity
77     16     16     16     Male           PT          
78     78     16     16     Male           PT          
79     79     16     16     Male           PT          
80     80     16     16     Male           PT           
81     81     16     16     Male           PT          
82     82     16     16     Male           PT          
83     83     16     16     Male           PT                  
91     91     16     16     Male           PT          
92     92     16     16     Male           PT          
93     93     16     16     Male           PT          
94     94     16     16     Male           PT     
95     95     16     16     Male           PT     
96     96     16     16     Male           PT     
97     97     16     16     Male           PT     
98     98     16     16     Male           PT     
99     99     16     16     Male           PT     
100   100     16     16     Male           PT     
101   101     16     16     Male           PT     
102   102     16     16     Male           PT        
127   127     16     16     Male           FT     
128   128     16     16     Male           FT     
129   129     16     16     Male           FT     
130   130     16     16     Male           FT     
131   131     16     16     Male           FT     
132   132     16     16     Male           FT     
133   133     16     16     Male           FT     
134   134     16     16     Male           FT     
135   135     16     16     Male           FT     
136   136     16     16     Male         SEFT     
137   137     16     16     Male           UN     
138   138     16     16     Male           UN     
139   139     16     16     Male           UN     
140   140     16     16     Male           UN     
141   141     16     16     Male           UN     
142   142     16     16     Male           UN     
143   143     16     16     Male           UN     
...   ...     ..     ..     ...            ..  

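Current code:

library(dplyr)  # sample_n()
library(tidyr)  # unite()

# Draw one unemployed male aged 16-24 whose Ethnic is still "0" and label him "Other"
UNOTH=sample_n(EdUNAS[EdUNAS$Bage=="16" & EdUNAS$Gender=="Male" & EdUNAS$Employment=="UN" & EdUNAS$Ethnic=="0",],size=1, replace=FALSE)
UNOTH["Ethnic"]="Other"
# Merge the sampled row back onto the full data by ID
Edunoth=merge(EdUNAS, UNOTH, by = "ID", all = TRUE)
# Drop the duplicated columns left over from the merge
Edunoth$Bage.x.x.y=NULL
Edunoth$Ages.x.x.y=NULL
Edunoth$Gender.x.x.y=NULL
Edunoth$Employment.x.x.y=NULL
# Blank out the NAs, then glue the two Ethnic columns into one
Edunoth[is.na(Edunoth)] = ''
EdUNOTH=unite(Edunoth, Ethnic, Ethnic.x:Ethnic.y, sep='')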
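
For illustration only, here is a rough sketch of how the per-group block above might be driven from a lookup table, so that every group can have its own sample size. It uses base R rather than sample_n so that it runs without extra packages. The `people` data frame is a made-up stand-in for EdUNAS, and the `quotas` table, its `n` column and the `assign_quota` helper are hypothetical names that do not appear in the question. It also assumes that `Ethnic == "0"` means "not yet assigned" and that everyone left over at the end is "White", which is a guess based on the desired output.

```r
set.seed(1)

# Made-up stand-in for EdUNAS (assumption: Ethnic == "0" means unassigned)
people <- data.frame(
  ID = 1:30,
  Bage = "16",
  Gender = "Male",
  Employment = rep(c("PT", "FT", "UN"), times = c(15, 8, 7)),
  Ethnic = "0",
  stringsAsFactors = FALSE
)

# Hypothetical quota table: one row per group and ethnicity, each with its own size
quotas <- data.frame(
  Bage = "16",
  Gender = "Male",
  Employment = c("PT", "PT", "UN"),
  Ethnic = c("Asian", "Other", "Other"),
  n = c(2, 1, 1),
  stringsAsFactors = FALSE
)

assign_quota <- function(data, quotas) {
  for (i in seq_len(nrow(quotas))) {
    q <- quotas[i, ]
    # Row positions in this group that have no ethnicity yet
    pool <- which(data$Bage == q$Bage & data$Gender == q$Gender &
                  data$Employment == q$Employment & data$Ethnic == "0")
    # Index into pool so that a pool of length 1 is sampled correctly
    chosen <- pool[sample.int(length(pool), size = q$n)]
    # Write the label straight into the column; no merge or unite needed
    data$Ethnic[chosen] <- q$Ethnic
  }
  data
}

people <- assign_quota(people, quotas)

# Assumption: whoever is still unassigned belongs to the majority group
people$Ethnic[people$Ethnic == "0"] <- "White"

table(people$Employment, people$Ethnic)
```

Because each pass only samples from rows that are still "0", nobody is given two ethnicities, and a new group or size is one more row in `quotas` instead of another nine lines of code.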

Desired output: the Ethnicity column filled in according to the proportions given in the census data.

       ID    Ages    Bage  Gender     Employment   Ethnicity
77     16     16     16     Male           PT        White
78     78     16     16     Male           PT        White  
79     79     16     16     Male           PT        White
80     80     16     16     Male           PT        White 
81     81     16     16     Male           PT        White  
82     82     16     16     Male           PT        White  
83     83     16     16     Male           PT        Asian          
91     91     16     16     Male           PT        White  
92     92     16     16     Male           PT        White  
93     93     16     16     Male           PT        Other  
94     94     16     16     Male           PT        White
95     95     16     16     Male           PT        White
96     96     16     16     Male           PT        White
97     97     16     16     Male           PT        White
98     98     16     16     Male           PT        Asian
99     99     16     16     Male           PT        White
100   100     16     16     Male           PT        White
101   101     16     16     Male           PT        White
102   102     16     16     Male           PT        White
127   127     16     16     Male           FT        White
128   128     16     16     Male           FT        White
129   129     16     16     Male           FT        White
130   130     16     16     Male           FT        White
131   131     16     16     Male           FT        White
132   132     16     16     Male           FT        White
133   133     16     16     Male           FT        White
134   134     16     16     Male           FT        White
135   135     16     16     Male           FT        White
136   136     16     16     Male         SEFT        White
137   137     16     16     Male           UN        White
138   138     16     16     Male           UN        White
139   139     16     16     Male           UN        White
140   140     16     16     Male           UN        White
141   141     16     16     Male           UN        Asian
142   142     16     16     Male           UN        White
143   143     16     16     Male           UN        White
...   ...     ..     ..     ...            ..        ...
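
If the census figures are available as proportions rather than head counts, something along these lines might produce a column like the one above. This is only a sketch under stated assumptions: the `props` list holds invented proportions (not real census values), `draw_ethnicity` is a hypothetical helper, and the toy `people` data frame groups by Employment alone to keep it short. Any rounding remainder is pushed into the largest category so that the labels always add up to the group size.

```r
set.seed(42)

# Toy data: 10 part-time and 10 unemployed people
people <- data.frame(
  ID = 1:20,
  Employment = rep(c("PT", "UN"), each = 10),
  stringsAsFactors = FALSE
)

# Invented proportions per employment type (NOT real census figures)
props <- list(
  PT = c(White = 0.8, Asian = 0.1, Other = 0.1),
  UN = c(White = 0.9, Asian = 0.1, Other = 0.0)
)

draw_ethnicity <- function(group, size) {
  p <- props[[group]]
  counts <- round(p * size)
  # Put any rounding difference into the largest category
  biggest <- which.max(p)
  counts[biggest] <- counts[biggest] + (size - sum(counts))
  labels <- rep(names(counts), times = counts)
  # Shuffle so the labels land on random people within the group
  labels[sample.int(size)]
}

people$Ethnicity <- NA_character_
for (g in unique(people$Employment)) {
  rows <- which(people$Employment == g)
  people$Ethnicity[rows] <- draw_ethnicity(g, length(rows))
}

table(people$Employment, people$Ethnicity)
```

This fixes the number of people per ethnicity exactly and only randomizes who receives which label. If approximate proportions are acceptable, `sample(names(p), size, replace = TRUE, prob = p)` would be a shorter alternative.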

0 answers:

No answers yet.