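I am trying to use sample_n by age group (Bage), gender and employment to create a new column holding ethnicity. I have found a way to do it, but each sample takes 9 lines of code, and the size changes every time because I assign a different number of people to each ethnic group.
The example below shows the code for randomly assigning unemployed males in the 16-24 age group to the ethnicity the census defines as "Other". The sample data is taken from the full dataset. After this I repeat the whole block for every employment type and ethnicity (changing the details: ethnicity, gender, employment, size), so it is a long and slow process. I have looked at writing a loop or a function, but because I need a different sample size for each group rather than one sample size across the whole dataset, I have not really got anywhere.
Any suggestions for reducing the length of the code and the time it takes would be much appreciated.
Example input data: males in the 16-24 age group (Bage == 16) with various employment types: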
     ID Ages Bage Gender Employment Ethnicity
77   16   16   16 Male   PT
78   78   16   16 Male   PT
79   79   16   16 Male   PT
80   80   16   16 Male   PT
81   81   16   16 Male   PT
82   82   16   16 Male   PT
83   83   16   16 Male   PT
91   91   16   16 Male   PT
92   92   16   16 Male   PT
93   93   16   16 Male   PT
94   94   16   16 Male   PT
95   95   16   16 Male   PT
96   96   16   16 Male   PT
97   97   16   16 Male   PT
98   98   16   16 Male   PT
99   99   16   16 Male   PT
100 100   16   16 Male   PT
101 101   16   16 Male   PT
102 102   16   16 Male   PT
127 127   16   16 Male   FT
128 128   16   16 Male   FT
129 129   16   16 Male   FT
130 130   16   16 Male   FT
131 131   16   16 Male   FT
132 132   16   16 Male   FT
133 133   16   16 Male   FT
134 134   16   16 Male   FT
135 135   16   16 Male   FT
136 136   16   16 Male   SEFT
137 137   16   16 Male   UN
138 138   16   16 Male   UN
139 139   16   16 Male   UN
140 140   16   16 Male   UN
141 141   16   16 Male   UN
142 142   16   16 Male   UN
143 143   16   16 Male   UN
... ...   ..   .. ...    ..
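Current code:
library(dplyr)  # sample_n()
library(tidyr)  # unite()

# Draw 1 unassigned (Ethnic == "0") unemployed male aged 16-24 and label him "Other"
UNOTH <- sample_n(EdUNAS[EdUNAS$Bage == "16" & EdUNAS$Gender == "Male" & EdUNAS$Employment == "UN" & EdUNAS$Ethnic == "0", ], size = 1, replace = FALSE)
UNOTH["Ethnic"] <- "Other"
# Merge the labelled row back onto the full data by ID
Edunoth <- merge(EdUNAS, UNOTH, by = "ID", all = TRUE)
# Drop the duplicated columns left over from the merge
Edunoth$Bage.x.x.y <- NULL
Edunoth$Ages.x.x.y <- NULL
Edunoth$Gender.x.x.y <- NULL
Edunoth$Employment.x.x.y <- NULL
# Blank out the NAs, then paste the two Ethnic columns into one
Edunoth[is.na(Edunoth)] <- ''
EdUNOTH <- unite(Edunoth, Ethnic, Ethnic.x:Ethnic.y, sep = '')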
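For reference, the nine lines above could probably be folded into one function that takes the cell, the label and the size as arguments. This is only a rough sketch: the name `assign_ethnic` and the toy data frame are made up for illustration, and it assumes unassigned rows are marked with Ethnic == "0", as the filter in the current code suggests. It writes the label straight into the chosen rows, so no merge or unite step is needed.

```r
# Hypothetical helper: label `size` randomly chosen unassigned rows in one
# Bage/Gender/Employment cell. Assumes unassigned rows have Ethnic == "0".
assign_ethnic <- function(df, bage, gender, employment, label, size) {
  # Row positions that are still unassigned within the requested cell
  pool <- which(df$Bage == bage & df$Gender == gender &
                df$Employment == employment & df$Ethnic == "0")
  if (size > length(pool)) stop("not enough unassigned rows in this cell")
  # Index into `pool` so a one-element pool is not expanded to 1:n by sample()
  chosen <- pool[sample.int(length(pool), size)]
  df$Ethnic[chosen] <- label
  df
}

# Toy data shaped like the example: 7 unemployed males aged 16-24
set.seed(1)
EdUNAS <- data.frame(ID = 137:143, Ages = 16, Bage = 16, Gender = "Male",
                     Employment = "UN", Ethnic = "0", stringsAsFactors = FALSE)

# One call replaces the nine lines; only the arguments change per sample
EdUNAS <- assign_ethnic(EdUNAS, 16, "Male", "UN", "Other", size = 1)
table(EdUNAS$Ethnic)
```

Each further sample would then be a single call with a different label and size, applied to the already updated data frame so that rows labelled earlier are not drawn again.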
Desired output: the Ethnicity column filled in according to the proportions given by the census data.
     ID Ages Bage Gender Employment Ethnicity
77   16   16   16 Male   PT         White
78   78   16   16 Male   PT         White
79   79   16   16 Male   PT         White
80   80   16   16 Male   PT         White
81   81   16   16 Male   PT         White
82   82   16   16 Male   PT         White
83   83   16   16 Male   PT         Asian
91   91   16   16 Male   PT         White
92   92   16   16 Male   PT         White
93   93   16   16 Male   PT         Other
94   94   16   16 Male   PT         White
95   95   16   16 Male   PT         White
96   96   16   16 Male   PT         White
97   97   16   16 Male   PT         White
98   98   16   16 Male   PT         Asian
99   99   16   16 Male   PT         White
100 100   16   16 Male   PT         White
101 101   16   16 Male   PT         White
102 102   16   16 Male   PT         White
127 127   16   16 Male   FT         White
128 128   16   16 Male   FT         White
129 129   16   16 Male   FT         White
130 130   16   16 Male   FT         White
131 131   16   16 Male   FT         White
132 132   16   16 Male   FT         White
133 133   16   16 Male   FT         White
134 134   16   16 Male   FT         White
135 135   16   16 Male   FT         White
136 136   16   16 Male   SEFT       White
137 137   16   16 Male   UN         White
138 138   16   16 Male   UN         White
139 139   16   16 Male   UN         White
140 140   16   16 Male   UN         White
141 141   16   16 Male   UN         Asian
142 142   16   16 Male   UN         White
143 143   16   16 Male   UN         Asian
... ...   ..   .. ...    ..         ...
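One way an output like this might be produced in a single pass is to keep the census counts in a lookup table and shuffle the labels within each Bage/Gender/Employment cell. The sketch below is illustrative only: the `quota` table, its counts and the `people` data frame are invented, and it assumes the counts for each cell add up exactly to the number of people in that cell.

```r
# Hypothetical quota table: one row per cell and ethnicity (counts are made up)
quota <- data.frame(
  Bage = 16, Gender = "Male",
  Employment = c("UN", "UN", "UN", "PT", "PT", "PT"),
  Ethnic = c("White", "Asian", "Other", "White", "Asian", "Other"),
  n = c(5, 1, 1, 3, 1, 0),
  stringsAsFactors = FALSE
)

# Toy data: 7 unemployed and 4 part-time males aged 16-24, ethnicity not yet set
people <- data.frame(ID = 1:11, Bage = 16, Gender = "Male",
                     Employment = rep(c("UN", "PT"), c(7, 4)),
                     Ethnic = NA_character_, stringsAsFactors = FALSE)

set.seed(42)
cells <- unique(quota[c("Bage", "Gender", "Employment")])
for (i in seq_len(nrow(cells))) {
  # Quota rows belonging to this cell (joined on the three shared columns)
  q <- merge(cells[i, ], quota)
  # Positions of the people who fall in this cell
  rows <- which(people$Bage == cells$Bage[i] &
                people$Gender == cells$Gender[i] &
                people$Employment == cells$Employment[i])
  # One label per person, repeated according to the quota, then shuffled
  labels <- rep(q$Ethnic, q$n)
  stopifnot(length(labels) == length(rows))
  people$Ethnic[rows] <- labels[sample.int(length(labels))]
}
table(people$Employment, people$Ethnic)
```

With this layout the sample sizes live in the table rather than in the code, so adding another employment type or ethnicity means adding rows to `quota` instead of repeating a block of code.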