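I have a data frame (Df1) with two variables, gender and town. I want to calculate the odds ratio for gender (female = 1), and I want to do this for each town, so that I end up with three odds ratios for Df1.
My actual dataset contains many more towns, so I would like to know whether there is a more general way to do this than typing the observation counts by hand into epitools::oddsratio().
Thanks!
Starting point (df):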
Df1 <- data.frame(gender=c("m","m","m","f","f","f","m","m","m","f","m","f","m","f","f","f","f","f","f","f"), town=c("ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ma","ma","ma","ma","ma","ma"))
Code so far:
library(epitools)
Df2 <- matrix(c(12,20,8,20),byrow=TRUE,ncol=2)
dimnames(Df2) <- list(Group=c("females","males"),MI=c("subtotal","total"))
oddsratio(Df2)
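Note: an odds ratio is literally the ratio between two odds.
Suppose 7 of 10 men are admitted: p = 0.7, q = 1 - 0.7 = 0.3
Suppose 3 of 10 women are admitted: p = 0.3, q = 1 - 0.3 = 0.7
Odds of admission for men: 0.7 / 0.3 = 2.333 (admitted / not admitted)
Odds of admission for women: 0.3 / 0.7 = 0.429
Odds ratio for admission: OR = 2.333 / 0.429 = 5.44,
that is, the odds of admission for men are 5.44 times those for women.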
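The arithmetic in the note can be checked with a few lines of base R. This is only a sketch of the worked example; the variable names (admitted_m, odds_m, and so on) are made up for illustration and do not come from any package.

```r
# Counts from the worked example: 7 of 10 men and 3 of 10 women admitted
admitted_m <- 7; total_m <- 10
admitted_f <- 3; total_f <- 10

# Probability of admission in each group
p_m <- admitted_m / total_m
p_f <- admitted_f / total_f

# Odds = p / q, where q = 1 - p
odds_m <- p_m / (1 - p_m)
odds_f <- p_f / (1 - p_f)

# Odds ratio of admission, men versus women
or_m_vs_f <- odds_m / odds_f
round(or_m_vs_f, 2)
```

The same value follows from the cross-product of the counts, (7 * 7) / (3 * 3).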
Answer 0 (score: 0)
Something like this?
library(tidyverse)
Df1 <- data.frame(gender=c("m","m","m","f","f","f","m","m","m","f","m","f","m","f","f","f","f","f","f","f"), town=c("ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ma","ma","ma","ma","ma","ma"))
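# One row per town: gender shares, within-town odds, and their ratio
Df1 %>%
  group_by(town) %>%
  summarise(
    p_males = sum(gender == "m") / n(),     # share of males in the town
    p_females = sum(gender == "f") / n(),   # share of females in the town
    odds_males = p_males / p_females,       # odds of being male
    odds_females = p_females / p_males,     # odds of being female
    odds_ratio = odds_males / odds_females  # ratio of the two odds
  )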
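Note that the odds_ratio column above divides two odds computed within the same town, so it equals (p_males / p_females)^2. If the goal is instead to compare each town with the others, one possible reading (an assumption, since the question does not name a reference group) is the odds of being female in a town divided by the odds of being female in all other towns combined. A minimal base R sketch of that reading follows; or_town_vs_rest is a hypothetical helper, not an epitools function.

```r
Df1 <- data.frame(
  gender = c("m","m","m","f","f","f","m","m","m","f","m","f","m","f","f","f","f","f","f","f"),
  town = c("ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ma","ma","ma","ma","ma","ma")
)

# Counts per town and gender, so nothing has to be typed in by hand;
# rows are towns, columns are "f" and "m"
counts <- table(Df1$town, Df1$gender)

# Hypothetical helper: cross-product odds ratio of being female,
# comparing one town with all other towns combined (an assumed reference group)
or_town_vs_rest <- function(town, counts) {
  f_in <- counts[town, "f"]
  m_in <- counts[town, "m"]
  f_out <- sum(counts[, "f"]) - f_in
  m_out <- sum(counts[, "m"]) - m_in
  (f_in * m_out) / (m_in * f_out)
}

# One odds ratio per town, named by town
ors <- sapply(rownames(counts), or_town_vs_rest, counts = counts)
ors
```

Because the towns are read from the table, the same code works for any number of towns. A town with a zero cell, such as ma, which has no males in this sample, gives an infinite (or zero) ratio, so such cases may need separate handling.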