I have a dataset named carcom that looks like this:
carcom <- data.frame(household = c(173, 256, 256, 319, 319, 319, 422, 422, 422, 422), individuals= c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))
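In individuals, the father is "1", the mother is "2", and the children are "3" and "4". I want to get two new columns. The first should indicate how many children there are in the household. The second should sum a weight over the members of each household, where the father counts as "1", the mother as "0.5", and each child as "0.3". My new dataset should look like this:
newcarcom <- data.frame(household = c(173, 256, 319, 422), child = c(0, 0, 1, 2), weight = c(1, 1.5, 1.8, 2.1))
I have been searching for a solution for days. I would be very grateful if someone could help. Thanks.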
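Answer 0 (score: 2)
We can count the number of individuals with value 3 or 4 in each household. To calculate weight, we use recode to change the values 1:4 to their corresponding weights and then take the sum.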
library(dplyr)
newcarcom <- carcom %>%
  group_by(household) %>%
  summarise(child = sum(individuals %in% 3:4),
            weight = sum(recode(individuals, `1` = 1, `2` = 0.5, .default = 0.3)))
#   household child weight
#       <dbl> <int>  <dbl>
# 1       173     0    1
# 2       256     0    1.5
# 3       319     1    1.8
# 4       422     2    2.1
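The same per-person mapping can also be spelled out with case_when() instead of recode(). The following is only a sketch under the assumption that individuals contains nothing but the codes 1 to 4; the result name newcarcom_cw is made up for illustration and is not part of the original answer.

```r
library(dplyr)

# Example data from the question
carcom <- data.frame(
  household   = c(173, 256, 256, 319, 319, 319, 422, 422, 422, 422),
  individuals = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4)
)

newcarcom_cw <- carcom %>%
  group_by(household) %>%
  summarise(
    # Codes 3 and 4 are children
    child = sum(individuals %in% 3:4),
    # Father = 1, mother = 0.5, everyone else (assumed to be children) = 0.3
    weight = sum(case_when(
      individuals == 1 ~ 1,
      individuals == 2 ~ 0.5,
      TRUE ~ 0.3
    ))
  )

newcarcom_cw
```

Writing the conditions out makes it easier to add another category later, at the cost of a few more lines.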
Base R version, as suggested by @markus:
newcarcom <- do.call(data.frame, aggregate(individuals ~ household, carcom, function(x)
c(child = sum(x %in% 3:4), weight = sum(replace(y <- x^-1, y < 0.5, 0.3)))))
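The x^-1 trick is compact but somewhat cryptic: it turns 1 into 1 and 2 into 0.5, and every value below 0.5 (codes 3 and 4) is then replaced by 0.3. A more explicit base R sketch, assuming the codes are always the integers 1 to 4, uses a lookup vector. The names w, tmp and newcarcom_lookup are illustrative only.

```r
# Example data from the question
carcom <- data.frame(
  household   = c(173, 256, 256, 319, 319, 319, 422, 422, 422, 422),
  individuals = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4)
)

# Position i of w holds the weight for code i (assumes codes 1..4 only)
w <- c(1, 0.5, 0.3, 0.3)

# Per-person helper columns: child flag and weight
tmp <- transform(
  carcom,
  child  = as.integer(individuals %in% 3:4),
  weight = w[individuals]
)

# Sum both helper columns within each household
newcarcom_lookup <- aggregate(cbind(child, weight) ~ household, data = tmp, FUN = sum)
newcarcom_lookup
```

Unlike the do.call(data.frame, ...) version, this returns columns named child and weight directly, without an individuals. prefix.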
Answer 1 (score: 0)
With data.table:
library(data.table)
library(dplyr)  # recode() comes from dplyr
setDT(carcom)[, .(child = sum(individuals %in% 3:4),
weight = sum(recode(individuals,`1` = 1, `2` = 0.5, .default = 0.3))), household]
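Note that recode() is a dplyr function, so the snippet above still needs dplyr to be loaded. If you would rather depend on data.table alone, one possible sketch replaces recode() with a lookup vector, again assuming the codes are always 1 to 4. The names w, dt and res are hypothetical.

```r
library(data.table)

# Example data from the question
carcom <- data.frame(
  household   = c(173, 256, 256, 319, 319, 319, 422, 422, 422, 422),
  individuals = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4)
)

# Position i of w holds the weight for code i (assumes codes 1..4 only)
w <- c(1, 0.5, 0.3, 0.3)

# as.data.table() makes a copy, so carcom itself stays a plain data.frame
dt <- as.data.table(carcom)

res <- dt[, .(child  = sum(individuals %in% 3:4),
              weight = sum(w[individuals])),
          by = household]
res
```

Using as.data.table() rather than setDT() avoids converting the original carcom in place, which matters if other code still expects a data.frame.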