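I have a dataset in which some values belong to unknown categories, and I want to allocate them to the known ones. In the example created below, I look at counts of Hogwarts students over two years (1991 and 1992) from four houses (Gryffindor, Slytherin, Ravenclaw, and Hufflepuff), of two genders, who own one of three pets (toad, owl, or cat). For example, the first row says that in 1991 two male Gryffindor students owned a toad.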
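However, I also have cases where the gender and/or the pet is unknown. For example, I know that in 1991 there were 10 Gryffindor students who owned a toad, but I don't know their gender. I want to allocate these 10 students to the two genders based on the proportion of male and female Gryffindor students who owned toads in 1991. Of the Gryffindor toad owners of known gender in 1991, males account for 25% (2/8) and females for 75% (6/8). So I would add 25% of the 10 unknown-gender toad owners to the male count (bringing the total of male Gryffindors who owned toads in 1991 to 4.5) and 75% of them to the female count (bringing the total of female Gryffindors who owned toads in 1991 to 13.5).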
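To make the arithmetic concrete, here is a minimal sketch of that single case in base R. The variable names (`known`, `unknown`, `share`, `allocated`) are only illustrative, and the numbers are copied by hand from the 1991 Gryffindor toad rows of the example data.

```r
# Known-gender toad owners in Gryffindor, 1991 (copied from the example data)
known <- c(male = 2, female = 6)

# Toad owners in Gryffindor, 1991, whose gender is unknown
unknown <- 10

# Proportion of each gender among the owners of known gender
share <- known / sum(known)

# Add each gender's proportional part of the unknown count to its known count
allocated <- known + unknown * share
print(allocated)
```

This gives 4.5 for male and 13.5 for female, matching the totals described above.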
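I would use the same logic in cases where I know the gender but not the type of pet owned, and in cases where I know the year and house but neither the gender nor the type of pet owned.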
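As a rough illustration of those two other cases, the sketch below wraps the proportional rule in a small helper. The function name `allocate` is hypothetical, and two things are assumptions rather than part of the question: the proportions are taken from the raw known counts (the order in which the three kinds of unknowns should be allocated is not specified here), and a group with no known counts at all is split evenly.

```r
# Hypothetical helper: spread `unknown` across the cells of `known`
# in proportion to the known counts.
allocate <- function(known, unknown) {
  total <- sum(known)
  if (total == 0) {
    # Assumption: with no known counts to base proportions on, split evenly
    return(known + unknown / length(known))
  }
  known + unknown * known / total
}

# Gender known, pet unknown: male Gryffindors in 1991,
# with 20 students whose pet is unknown
male_pets <- allocate(c(toad = 2, owl = 4, cat = 0), 20)

# Neither gender nor pet known: Gryffindor in 1991,
# with 12 students spread over all six gender/pet cells
cells <- c(male.toad = 2, male.owl = 4, male.cat = 0,
           female.toad = 6, female.owl = 6, female.cat = 4)
all_cells <- allocate(cells, 12)

print(male_pets)
print(all_cells)
```

Note that a cell with a known count of zero (such as male Gryffindor cat owners in 1991) receives none of the unknowns under this rule, which may or may not be what is wanted.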
DF <- read.table(text = "year house gender pet count
1991 gryffindor male toad 2
1991 gryffindor male owl 4
1991 gryffindor male cat 0
1991 gryffindor female toad 6
1991 gryffindor female owl 6
1991 gryffindor female cat 4
1991 gryffindor unknown toad 10
1991 gryffindor unknown owl 2
1991 gryffindor unknown cat 4
1991 gryffindor male unknown 20
1991 gryffindor female unknown 16
1991 gryffindor unknown unknown 12
1991 slytherin male toad 4
1991 slytherin male owl 2
1991 slytherin male cat 2
1991 slytherin female toad 6
1991 slytherin female owl 2
1991 slytherin female cat 4
1991 slytherin unknown toad 2
1991 slytherin unknown owl 4
1991 slytherin unknown cat 4
1991 slytherin male unknown 22
1991 slytherin female unknown 14
1991 slytherin unknown unknown 14
1991 hufflepuff male toad 2
1991 hufflepuff male owl 2
1991 hufflepuff male cat 0
1991 hufflepuff female toad 0
1991 hufflepuff female owl 3
1991 hufflepuff female cat 4
1991 hufflepuff unknown toad 4
1991 hufflepuff unknown owl 2
1991 hufflepuff unknown cat 4
1991 hufflepuff male unknown 28
1991 hufflepuff female unknown 10
1991 hufflepuff unknown unknown 12
1991 ravenclaw male toad 2
1991 ravenclaw male owl 4
1991 ravenclaw male cat 2
1991 ravenclaw female toad 6
1991 ravenclaw female owl 8
1991 ravenclaw female cat 8
1991 ravenclaw unknown toad 2
1991 ravenclaw unknown owl 2
1991 ravenclaw unknown cat 4
1991 ravenclaw male unknown 16
1991 ravenclaw female unknown 18
1991 ravenclaw unknown unknown 14
1992 gryffindor male toad 2
1992 gryffindor male owl 4
1992 gryffindor male cat 8
1992 gryffindor female toad 2
1992 gryffindor female owl 2
1992 gryffindor female cat 4
1992 gryffindor unknown toad 2
1992 gryffindor unknown owl 4
1992 gryffindor unknown cat 4
1992 gryffindor male unknown 20
1992 gryffindor female unknown 14
1992 gryffindor unknown unknown 12
1992 slytherin male toad 2
1992 slytherin male owl 4
1992 slytherin male cat 0
1992 slytherin female toad 6
1992 slytherin female owl 2
1992 slytherin female cat 4
1992 slytherin unknown toad 2
1992 slytherin unknown owl 2
1992 slytherin unknown cat 4
1992 slytherin male unknown 20
1992 slytherin female unknown 16
1992 slytherin unknown unknown 12
1992 hufflepuff male toad 2
1992 hufflepuff male owl 4
1992 hufflepuff male cat 0
1992 hufflepuff female toad 6
1992 hufflepuff female owl 8
1992 hufflepuff female cat 4
1992 hufflepuff unknown toad 4
1992 hufflepuff unknown owl 2
1992 hufflepuff unknown cat 4
1992 hufflepuff male unknown 22
1992 hufflepuff female unknown 18
1992 hufflepuff unknown unknown 12
1992 ravenclaw male toad 2
1992 ravenclaw male owl 6
1992 ravenclaw male cat 0
1992 ravenclaw female toad 6
1992 ravenclaw female owl 2
1992 ravenclaw female cat 4
1992 ravenclaw unknown toad 2
1992 ravenclaw unknown owl 2
1992 ravenclaw unknown cat 8
1992 ravenclaw male unknown 10
1992 ravenclaw female unknown 20
1992 ravenclaw unknown unknown 14",
header = TRUE, stringsAsFactors = FALSE)
Any suggestions on the most efficient way to code this in R?