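I recently got help with calculating net proportions for a table in R, but my attempts to summarise the result one step further have not worked. Since I have already accepted an answer there, I need to post a new question.
Here is my raw data (which I call qf):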
genre status rb wrb inn
Fiction FAILURE 621 66 1347
Fiction FAILURE 400 46 928
Fiction FAILURE 238 35 663
Poetry FAILURE 513 105 1732
Poetry FAILURE 165 47 393
Poetry FAILURE 896 193 2350
Love-story FAILURE 5690 501 8869
Love-story FAILURE 1284 174 2793
Love-story FAILURE 7279 715 13852
Love-story SUCCESS 18150 1734 39635
Poetry SUCCESS 1988 226 4712
Love-story SUCCESS 20110 2222 43953
Love-story SUCCESS 20762 2288 46706
Poetry SUCCESS 1824 322 3984
Poetry SUCCESS 1105 148 2751
Adventure SUCCESS 4675 617 8462
Adventure SUCCESS 7943 599 17247
Adventure SUCCESS 7290 601 17774
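Thanks to that answer, I managed to summarise it by genre and by success/failure (I like to keep track of every transformation, hence the multiple data frames):
library(dplyr)

# Sum rb, wrb and inn within each genre/status combination
qf2 <- qf %>% group_by(genre, status) %>% summarise_all(sum)
qf3 <- qf2 %>% as.data.frame()

# Net proportion per genre: SUCCESS share minus FAILURE share
# (funs() is deprecated as of dplyr 0.8.0; it is kept here as originally written)
qf4 <- qf3 %>% mutate(rowSum = rowSums(.[, names(qf3)[3:5]])) %>%
  group_by(genre) %>%
  summarise_at(vars(names(qf3)[3:5]),
               funs(net = .[status == "SUCCESS"] / rowSum[status == "SUCCESS"] -
                          .[status == "FAILURE"] / rowSum[status == "FAILURE"])) %>%
  as.data.frame()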
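What I want to do now is get the overall proportions, across all genres. Nothing I have tried works, though, and I suspect I am missing something obvious.
The output I am trying to get is the following (columns are rb, wrb and inn):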
Sum-FAILURE 0.329241738 0.036265536 0.634492726
Sum-SUCCESS 0.301794636 0.031519501 0.666685863
Net -0.027447103 -0.004746035 0.032193137
The calculation I am trying to build is (for rb):
Sum(success_rb) / (Sum(success_rb) + Sum(success_wrb) + Sum(success_inn)) - Sum(failure_rb) / (Sum(failure_rb) + Sum(failure_wrb) + Sum(failure_inn))
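Answer 0 (score: 2)
library(dplyr)

qf %>%
  select(-genre) %>%                # drop genre so only status and the counts remain
  group_by(status) %>%
  summarise_all(sum) %>%            # one row of totals per status
  {.[-1] / rowSums(.[-1])} %>%      # divide each total by its row sum
  rbind(.[2, ] - .[1, ])            # append net row: SUCCESS minus FAILURE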
rb wrb inn
1 0.3292417 0.036265536 0.63449273
2 0.3017946 0.031519501 0.66668586
21 -0.0274471 -0.004746035 0.03219314
# The same computation with data.table (note that setDT() converts qf in place)
library(data.table)
setDT(qf)[, lapply(.SD, sum), status, .SDcols = 3:5][,
  .SD / rowSums(.SD), .SDcols = -1][, rbind(.SD, .SD[2] - .SD[1])]
rb wrb inn
1: 0.3292417 0.036265536 0.63449273
2: 0.3017946 0.031519501 0.66668586
3: -0.0274471 -0.004746035 0.03219314
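For comparison, the same three steps (sum per status, divide by the row total, subtract FAILURE from SUCCESS) can probably also be done in base R without extra packages. The following is only a minimal sketch, not part of the original answer: it rebuilds qf from the table in the question so that it runs on its own, and the names cols, sums, props and result are made up for illustration.

```r
# Assumption: qf is rebuilt here from the question's table so the sketch is self-contained.
qf <- read.table(header = TRUE, text = "
genre status rb wrb inn
Fiction FAILURE 621 66 1347
Fiction FAILURE 400 46 928
Fiction FAILURE 238 35 663
Poetry FAILURE 513 105 1732
Poetry FAILURE 165 47 393
Poetry FAILURE 896 193 2350
Love-story FAILURE 5690 501 8869
Love-story FAILURE 1284 174 2793
Love-story FAILURE 7279 715 13852
Love-story SUCCESS 18150 1734 39635
Poetry SUCCESS 1988 226 4712
Love-story SUCCESS 20110 2222 43953
Love-story SUCCESS 20762 2288 46706
Poetry SUCCESS 1824 322 3984
Poetry SUCCESS 1105 148 2751
Adventure SUCCESS 4675 617 8462
Adventure SUCCESS 7943 599 17247
Adventure SUCCESS 7290 601 17774
")

cols <- c("rb", "wrb", "inn")  # hypothetical helper name for the count columns

# Column totals per status; rows come back named FAILURE and SUCCESS
sums <- rowsum(as.matrix(qf[, cols]), group = qf$status)

# Divide every total by its row sum so each row adds up to 1
props <- prop.table(sums, margin = 1)

# Append the net row: SUCCESS share minus FAILURE share
result <- rbind(props, Net = props["SUCCESS", ] - props["FAILURE", ])
print(result)
```

Because rowsum() labels the rows by status, the subtraction refers to "SUCCESS" and "FAILURE" by name rather than by position, which may be a little safer if the row order ever changes. The printed values should agree with the dplyr and data.table results above.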