Question

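I have a weighted survey dataset covering age group, income, and spending. I want to find the mean spending within each combination of age group and income band.

For example: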

DF:
Age   Income     Spending1 Spending2  Weight 
45-49 1000       50        35          100
30-39 2000       40        60          150
40-44 3434       30        55          120

So far I have coded this:

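  # weighted_ntile() is assumed to come from the grattan package
  library(dplyr)
  library(purrr)
  library(grattan)

  # 5 groups are requested, so despite its name hhdecile holds income quintiles
  DF$hhdecile <- weighted_ntile(DF$Income, weights = DF$Weight, 5)

  Result1 <- DF %>% group_by(Age, hhdecile) %>% dplyr::summarise(mean.exp = weighted.mean(x = Spending1, w = Weight))

  Result2 <- DF %>% group_by(Age, hhdecile) %>% dplyr::summarise(mean.exp = weighted.mean(x = Spending2, w = Weight))

  df.list <- list(Result1 = Result1,
                  Result2 = Result2)

  # rename the summary column so the two results stay distinguishable after the join
  names(df.list$Result1)[names(df.list$Result1) == "mean.exp"] <- "Result1"

  ResultJoined <- df.list %>% reduce(full_join, by = c('Age', 'hhdecile'))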

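This finds the income quintiles relative to the whole population across all age groups, whereas I am interested in quintiles computed relative to each age group.

Is it possible to use group_by or something similar to apply the weighted quantile function to each age group separately?

(There are actually 15 spending categories.)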
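
To make the goal concrete, here is a minimal sketch of the kind of result I am after. It is only an illustration under stated assumptions: wtd_ntile() is a hypothetical helper written for this example (it is not grattan's weighted_ntile() and may bin ties and boundaries differently), the toy data are made up, and across() assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical helper: assign each row to one of n weighted bins,
# based on the cumulative share of weight when sorted by x.
wtd_ntile <- function(x, w, n) {
  ord <- order(x)
  cum_share <- cumsum(w[ord]) / sum(w)
  bins <- pmin(n, pmax(1L, ceiling(cum_share * n)))
  out <- integer(length(x))
  out[ord] <- as.integer(bins)
  out
}

# Made-up toy data in the same shape as DF
DF <- data.frame(
  Age       = c("30-39", "30-39", "30-39", "30-39", "40-44", "40-44"),
  Income    = c(1000, 2000, 3000, 4000, 1500, 3434),
  Spending1 = c(10, 20, 30, 40, 30, 50),
  Spending2 = c(15, 25, 35, 45, 55, 60),
  Weight    = c(100, 100, 100, 100, 120, 120)
)

result <- DF %>%
  group_by(Age) %>%                                   # quintiles within each age group
  mutate(hhquintile = wtd_ntile(Income, Weight, 5)) %>%
  group_by(Age, hhquintile) %>%
  summarise(
    # one weighted mean per spending column, so 15 categories need no extra code
    across(starts_with("Spending"), ~ weighted.mean(.x, w = Weight)),
    .groups = "drop"
  )
```

Computing the bins inside mutate() after group_by(Age) is what would make them relative to each age group, and across() would avoid building and joining one result per spending column.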

group-by weighted survey function

0 answers: