I am creating a cross table of two categorical variables with the following code:
library(dplyr)
library(reshape2)
T1.1<-table(data$Q7_1,data$Q9,exclude = NULL)
T1.1<-data.frame(T1.1)
T1.2<-dcast(T1.1, Var1~Var2)
T1.2<-T1.2%>%mutate(Industry=as.character(Var1),Total_responses=A+B+C)%>%select(Industry,A,B,C,Total_responses)
C<-c("Industry"="ALL", colSums(T1.2[,2:5]))
T1.2<-rbind(C,T1.2)
This gives the output:
Industry A B C Total_responses
1 ALL 20 18 18 56
2 Banking/Financial Services 2 2 2 6
3 Chemicals 0 1 2 3
4 Consumer Goods 1 1 1 3
5 Energy 2 1 0 3
6 High Tech 6 0 2 8
7 Insurance/Reinsurance 0 2 0 2
8 Life Sciences 0 0 0 0
9 Logistics 0 0 2 2
10 Mining & Metals 1 1 1 3
11 Other Manufacturing 1 2 0 3
12 Other Non-Manufacturing 3 2 2 7
13 Retail & Wholesale 1 1 0 2
14 Services (Non-Financial) 2 4 5 11
15 Transportation Equipment 1 1 1 3
16 <NA> 0 0 0 0
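This output is fine, but the problem is that after using the table() function I convert the result to a data frame and then use dcast to get the table layout I want. After dcast, an extra NA row appears that I don't want.
I would also like to wrap this whole calculation in a function that I can reuse for other factors with more levels.
Q9 has 3 levels, A, B and C. I don't want to compute the total responses by hand like this (A+B+C); I want a function that works with any other factor, whatever its number of levels. Please suggest any other efficient approach.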
> dput(data)
structure(list(Q7_1 = structure(c(5L, 5L, 14L, 1L, 9L, 13L, 1L,
3L, 13L, 13L, 13L, 12L, 2L, 11L, 13L, 10L, 11L, 1L, 4L, 5L, 5L,
4L, 5L, 9L, 2L, 4L, 13L, 10L, 13L, 13L, 11L, 1L, 11L, 5L, NA,
1L, 9L, 3L, 1L, 5L, NA, 2L, NA, 6L, 14L, NA, NA, 14L, 8L, 11L,
8L, 12L, 13L, NA, 3L, 11L, 11L, NA, 10L, 6L, 5L, 13L, 13L), .Label = c("Banking/Financial Services",
"Chemicals", "Consumer Goods", "Energy", "High Tech", "Insurance/Reinsurance",
"Life Sciences", "Logistics", "Mining & Metals", "Other Manufacturing",
"Other Non-Manufacturing", "Retail & Wholesale", "Services (Non-Financial)",
"Transportation Equipment"), class = "factor"), Q9 = structure(c(1L,
3L, 3L, 3L, 3L, 1L, 1L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 2L, 1L,
3L, 1L, 1L, 1L, 2L, 1L, 2L, 3L, 1L, 1L, 1L, 3L, 3L, 3L, 2L, 2L,
1L, NA, 1L, 1L, 2L, 2L, 1L, NA, 2L, NA, 2L, 2L, NA, NA, 1L, 3L,
1L, 3L, 1L, 3L, NA, 1L, 3L, 1L, NA, 2L, 2L, 3L, 3L, 2L), .Label = c("A",
"B", "C"), class = "factor")), class = c("data.table", "data.frame"
), row.names = c(NA, -63L), .Names = c("Q7_1",
"Q9"))
Answer 0 (score: 1)
To convert a table to a data frame while keeping its wide layout, we can use as.data.frame.matrix().
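crossCalc <- function(data){
  # table() drops NA by default, so no <NA> row or column is created
  t <- table(data$Q7_1, data$Q9)
  # keep the wide layout: one row per Q7_1 level, one column per Q9 level
  t <- as.data.frame.matrix(t)
  Total_responses <- with(t, A + B + C)
  t <- cbind(t, Total_responses)
  # prepend the column totals as a row named "ALL"
  t <- rbind(ALL=colSums(t), t)
  return(t)
}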
crossCalc(data)
# A B C Total_responses
# ALL 20 18 18 56
# Banking/Financial Services 2 2 2 6
# Chemicals 0 1 2 3
# Consumer Goods 1 1 1 3
# Energy 2 1 0 3
# High Tech 6 0 2 8
# Insurance/Reinsurance 0 2 0 2
# Life Sciences 0 0 0 0
# Logistics 0 0 2 2
# Mining & Metals 1 1 1 3
# Other Manufacturing 1 2 0 3
# Other Non-Manufacturing 3 2 2 7
# Retail & Wholesale 1 1 0 2
# Services (Non-Financial) 2 4 5 11
# Transportation Equipment 1 1 1 3
Maybe this is what you want?
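The function above still hard-codes A + B + C, so it only fits a factor with exactly those three levels. A possible generalization, as a sketch only: take the two column names as arguments and use rowSums() for the total, so the number of levels no longer matters. The names cross_calc, row_var, col_var and the toy data below are made up for illustration and are not from the question.

```r
# Hypothetical generalization: works for any pair of factor columns.
cross_calc <- function(data, row_var, col_var) {
  # table() excludes NA by default, so no <NA> row or column appears
  tab <- table(data[[row_var]], data[[col_var]])
  # keep the wide layout (rows = row_var levels, columns = col_var levels)
  out <- as.data.frame.matrix(tab)
  # rowSums() adds up however many level columns there are
  out$Total_responses <- rowSums(out)
  # prepend the column totals as a row named "ALL"
  rbind(ALL = colSums(out), out)
}

# Toy data (assumed, not the asker's): a 4-level factor with some NAs
toy <- data.frame(
  industry = factor(c("Energy", "Energy", "Retail", NA, "Retail")),
  answer   = factor(c("A", "B", "A", NA, "D"), levels = c("A", "B", "C", "D"))
)
res <- cross_calc(toy, "industry", "answer")
print(res)
```

With the data from the question, the call would presumably be cross_calc(data, "Q7_1", "Q9"). Passing column names as strings and indexing with [[ ]] works for both plain data frames and data.table objects.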