I need to compute the frequency of individuals by age and marital status, so normally I would use:
table(age, marital_status)
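However, the data come from a sample, so each individual has a different sampling weight. How can I incorporate the weights into my frequency table?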
Answer 0 (score: 15)
You can use the function svytable from the survey package, or wtd.table from rgrs.
EDIT: rgrs is now called questionr:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(questionr)
wtd.table(x = df$var, weights = df$wt)
# A B
# 40 60
With dplyr:
library(dplyr)
count(x = df, var, wt = wt)
# # A tibble: 2 x 2
# var n
# <fctr> <dbl>
# 1 A 40
# 2 B 60
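The svytable route mentioned at the top of this answer is not shown with code, so here is a minimal sketch of it. It assumes the survey package is installed and that the design has no clustering or stratification (hence ids = ~1); a real survey would need its actual design variables.

```r
library(survey)

df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

# Assumption: simple design, one weight per row, no clusters or strata.
des <- svydesign(ids = ~1, weights = ~wt, data = df)

# Weighted counts of var; add more variables to the formula
# (e.g. ~age + marital_status) for a cross-tabulation.
tab <- svytable(~var, design = des)
tab
```

The design object can be reused for other weighted estimates, which is the main reason to prefer this route over a one-off sum.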
Answer 1 (score: 5)
For completeness, using base R:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
aggregate(x = list("wt" = df$wt), by = list("var" = df$var), FUN = sum)
var wt
1 A 40
2 B 60
Or with the less cumbersome formula notation:
aggregate(wt ~ var, data = df, FUN = sum)
var wt
1 A 40
2 B 60
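The examples so far are one-way tables, while the question asks for age by marital status. In base R, xtabs sums whatever is on the left-hand side of the formula within each cell, so it can serve as a weighted version of table(). A minimal sketch follows; the data frame d, its column names, and its values are made up for illustration.

```r
# Hypothetical data: names and values are illustrative only.
d <- data.frame(
  age            = c("18-34", "18-34", "35-64", "35-64", "35-64"),
  marital_status = c("single", "married", "single", "married", "married"),
  wt             = c(30, 10, 20, 25, 15)
)

# wt is summed within each age x marital_status cell.
weighted_tab <- xtabs(wt ~ age + marital_status, data = d)
weighted_tab

# Weighted row proportions (each age group sums to 1).
prop.table(weighted_tab, margin = 1)
```

The result is an ordinary contingency table, so prop.table, addmargins, and similar helpers work on it unchanged.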
Answer 2 (score: 2)
Another solution, using the expss package:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(expss)
fre(df$var, weight = df$wt)
| df$var | Count | Valid percent | Percent | Responses, % | Cumulative responses, % |
| ------ | ----- | ------------- | ------- | ------------ | ----------------------- |
| A | 40 | 40 | 40 | 40 | 40 |
| B | 60 | 60 | 60 | 60 | 100 |
| #Total | 100 | 100 | 100 | 100 | |
| <NA> | 0 | | 0 | | |
Answer 3 (score: 0)
You can also use tablefreq from the freqweights package:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(freqweights)
tablefreq(df, "var", "wt")
A tibble: 2 x 2
var freq
<fct> <dbl>
1 A 40
2 B 60
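Answer 4 (score: 0)
You can use data.table:
# using the same data as Victorp
library(data.table)
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
# setDT converts df to a data.table by reference; sum the weights within each var
setDT(df)[, .(n = sum(wt)), by = var]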
var n
1: A 40
2: B 60
Answer 5 (score: 0)
Using the wpct function from the weights package, which returns weighted proportions rather than counts:
require(weights)
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
wpct(df$var, df$wt)
A B
0.4 0.6