Question

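I have a large dataset for which I need to generate multiple cross-tables. These are specifically two-dimensional tables that report frequencies along with the mean and SD.

As an example, I have the following data: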

City <- c("A","B","A","A","B","C","D","A","D","C")
Q1 <- c("Agree","Agree","Agree","Agree","Agree","Neither","Neither","Disagree","Agree","Agree")
df <- data.frame(City,Q1)

With this data in mind, I would like to generate a cross-table that includes the mean:

          City
          A    B    C    D
Agree     3    2    1    1
Neither             1    1
Disagree  1
Total     4    2    2    2
Mean      2.5  3    2.5  2.5

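When computing the mean, Agree is given a weight of 3, Neither a weight of 2, and Disagree a weight of 1. In the cross-table output, the mean should appear just below the Total row. The table should also have grid lines between each column and each row.

Could you suggest how to achieve this in R?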

Answer 1

Here is a solution:

x <- table(df$Q1, df$City)  # basic crosstab: Q1 (rows) by City (columns)
# weights, listed in the same alphabetical order as the table rows
weights <- c("Agree" = 3, "Disagree" = 1, "Neither" = 2)
# weighted mean of each column (each city)
weightedmean <- apply(x, 2, function(col) sum(col * weights) / sum(col))
# append the column totals and the weighted means as extra rows
x <- rbind(x,
           apply(x, 2, sum),  # column totals
           weightedmean)
rownames(x)[4:5] <- c("Total", "Mean")
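
As a cross-check on the Mean row, the same per-city means can be obtained without building the crosstab first, by converting each response to its numeric score and averaging within each city. This is only a sketch: the names `scores`, `score` and `city_mean` are introduced here for illustration, and the data frame is rebuilt from the question so the block runs on its own.

```r
# Data from the question
City <- c("A","B","A","A","B","C","D","A","D","C")
Q1 <- c("Agree","Agree","Agree","Agree","Agree","Neither","Neither","Disagree","Agree","Agree")
df <- data.frame(City, Q1)

# Hypothetical lookup vector: numeric score for each response
scores <- c(Agree = 3, Neither = 2, Disagree = 1)

# Look up each respondent's score by response name
df$score <- unname(scores[as.character(df$Q1)])

# Mean score within each city
city_mean <- tapply(df$score, df$City, mean)
city_mean
```

Because the scores are looked up by name, this version does not depend on the order of the rows in the table.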

Answer 2

Here is a possible solution using addmargins, which lets you pass predefined functions to a table result:

wm <- function(x) sum(x * c(3, 1, 2)) / sum(x)
addmargins(table(df[2:1]), 1, list(list(Total = sum, Mean = wm)))

#           City
# Q1           A   B   C   D
#   Agree    3.0 2.0 1.0 1.0
#   Disagree 1.0 0.0 0.0 0.0
#   Neither  0.0 0.0 1.0 1.0
#   Total    4.0 2.0 2.0 2.0
#   Mean     2.5 3.0 2.5 2.5

If you want the SD, just add , SD = sd to the list of functions.
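
One point worth checking before relying on that: a function passed to addmargins receives the counts in each column, so sd would return the standard deviation of the counts (for City A, of 3, 1 and 0) rather than of the 3/2/1 scores. If the SD of the scores is what is needed, a count-weighted function can be passed instead. The sketch below assumes the sample (n - 1) form of the SD and that the table rows are in the alphabetical order Agree, Disagree, Neither; `w` and `wsd` are names made up for this example.

```r
# Data from the question
City <- c("A","B","A","A","B","C","D","A","D","C")
Q1 <- c("Agree","Agree","Agree","Agree","Agree","Neither","Neither","Disagree","Agree","Agree")
df <- data.frame(City, Q1)

# Scores for Agree, Disagree, Neither (alphabetical row order of the table)
w <- c(3, 1, 2)

# Weighted mean of the scores, with the counts x as frequencies
wm <- function(x) sum(x * w) / sum(x)

# Sample SD of the scores, with the counts x as frequencies (assumed n - 1 form)
wsd <- function(x) {
  n <- sum(x)
  if (n < 2) return(NA_real_)  # SD is undefined for fewer than two responses
  sqrt(sum(x * (w - wm(x))^2) / (n - 1))
}

res <- addmargins(table(df[2:1]), 1, list(list(Total = sum, Mean = wm, SD = wsd)))
res
```

With the example data, City A has scores 3, 3, 3 and 1, so its SD row entry is 1, while City B (two Agree responses) has an SD of 0.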

Generating cross-tables with means and SD in R