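I would like to compute several indices, together with their confidence intervals, for each combination of factors, and display them in a chart with ggplot2.
In the columns, 1 = positif and 0 = negatif; "individual=1" means that 1 individual was tested.
The indices must be computed per species + population + pathogen + dpi.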
...
Example:
AL: yu: dv: 21dpi infectrate = (2/3)*100; dissemrate = (2/2)*100; transrate = (2/2)*100; st = (220+100)/2 ## mean for the population, the pathogen and the dpi
AL: ti: dv: 21dpi infectrate = (2/4)*100
infectrate = (number positif/number of individuals tested)*100;
dissemrate = (number positif$dissem/number positif$infect)*100;
transrate = (number positif$trans/number positif$dissem)*100;
strate = mean($st);
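As a sanity check, the formulas above can be applied by hand to a single group. The sketch below uses the three AL / yu / dv / 21 dpi rows from the data; the name `grp` is only illustrative, and averaging `st` over transmission-positive individuals is an assumption inferred from the `(220+100)/2` example, not something the formulas state explicitly.

```r
# Rows for species AL, population yu, pathogen dv, dpi 21
grp <- data.frame(
  infect = c(1, 1, 0),
  dissem = c(1, 1, 0),
  trans  = c(1, 1, 0),
  st     = c(220, 100, 0)
)

infectrate <- sum(grp$infect) / nrow(grp) * 100        # (2/3) * 100
dissemrate <- sum(grp$dissem) / sum(grp$infect) * 100  # (2/2) * 100
transrate  <- sum(grp$trans) / sum(grp$dissem) * 100   # (2/2) * 100

# Assumption: st is averaged over transmission-positive individuals only,
# which is what (220 + 100) / 2 in the example implies.
strate <- mean(grp$st[grp$trans == 1])

print(c(infectrate = infectrate, dissemrate = dissemrate,
        transrate = transrate, strate = strate))
```

If `st` should instead be averaged over every tested individual, `mean(grp$st)` would be used, which gives a different value for this group.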
species population individual pathogen dpi infect dissem trans st
AL      yu         1          dv       21  1      1      1     220
AL      yu         2          dv       21  1      1      1     100
AL      yu         3          dv       21  0      0      0     0
AL      ti         1          dv       21  0      0      0     0
AL      ti         2          dv       21  1      1      1     60
AL      ti         3          dv       21  1      1      0     0
AL      ti         4          dv       21  0      0      0     0
AA      dla        1          dv       21  1      1      1     180
AA      dla        2          dv       21  1      1      0     0
AA      dla        3          dv       21  1      1      1     360
AL      yu         1          zk       21  0      0      0     0
AL      yu         2          zk       21  0      0      0     0
AA      mra        1          zk       14  1      1
AA      mra        2          zk       14  1      1
AA      yu         1          yv       21  0      0      0     0
AA      yu         2          yv       21  1      1      0     0
AL      bz         1          zk       14  1      1
AL      bz         2          zk       14  1      1
I've tried to use the dplyr package, but I didn't succeed.
...
When I run my code, it returns the same value of each index for every population.
Any help would be appreciated, thanks.
Answer 0 (score: 0)
I'm not sure I fully understand the calculations, but I think this is what you are looking for.
library(tidyverse)
df <-
  data.frame(stringsAsFactors=FALSE,
    species = c("AL", "AL", "AL", "AL", "AL", "AL", "AL", "AA", "AA", "AA",
                "AL", "AL", "AA", "AA", "AA", "AA", "AL", "AL"),
    population = c("yu", "yu", "yu", "ti", "ti", "ti", "ti", "dla", "dla",
                   "dla", "yu", "yu", "mra", "mra", "yu", "yu", "bz", "bz"),
    individual = c(1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 2, 1, 2, 1, 2),
    pathogen = c("dv", "dv", "dv", "dv", "dv", "dv", "dv", "dv", "dv", "dv",
                 "zk", "zk", "zk", "zk", "yv", "yv", "zk", "zk"),
    dpi = c(21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 14, 14, 21,
            21, 14, 14),
    infect = c(1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1),
    dissem = c(1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1),
    trans = c(1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, NA, NA, 0, 0, NA, NA),
    st = c(220, 100, 0, 0, 60, 0, 0, 180, 0, 360, 0, 0, NA, NA, 0, 0,
           NA, NA)
  )
# infectrate = (number positif/number of individuals tested)*100;
# dissemrate = (number positif$dissem/number positif$infect)*100;
# transrate = (number positif$trans/number positif$dissem)*100;
# strate = mean($st);
df %>%
  group_by(species, population, pathogen, dpi) %>%
  summarise(
    infectrate = sum(infect)/n()*100,
    # guard against 0/0 when no individual in the group is positive
    dissemrate = ifelse(infectrate == 0, 0, sum(dissem)/sum(infect)*100),
    transrate = ifelse(dissemrate == 0, 0, sum(trans)/sum(dissem)*100),
    # mean over all tested individuals in the group, zeros included
    strate = mean(st)
  ) %>%
  ungroup()
#> df
# A tibble: 7 x 8
#   species population pathogen   dpi infectrate dissemrate transrate strate
#   <chr>   <chr>      <chr>    <dbl>      <dbl>      <dbl>     <dbl>  <dbl>
# 1 AA      dla        dv          21      100          100      66.7   180
# 2 AA      mra        zk          14      100          100      NA      NA
# 3 AA      yu         yv          21       50          100       0       0
# 4 AL      bz         zk          14      100          100      NA      NA
# 5 AL      ti         dv          21       50          100      50      15
# 6 AL      yu         dv          21       66.7        100     100     107.
# 7 AL      yu         zk          21        0            0       0       0
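The summary above does not include the confidence intervals the question asks for. One possible approach, sketched here for `infectrate` only, is an exact (Clopper-Pearson) binomial interval from `binom.test()` in base R's stats package. The helper `rate_ci()` and the data frames `counts` and `infect_ci` are hypothetical names, the 95% level is an assumption, and other interval methods may suit the data better. With groups of 2 to 4 individuals the intervals will be very wide.

```r
# Hypothetical helper: percentage with an exact binomial confidence interval.
rate_ci <- function(x, n, conf.level = 0.95) {
  if (is.na(x) || is.na(n) || n == 0) {
    return(c(rate = NA_real_, lower = NA_real_, upper = NA_real_))
  }
  ci <- binom.test(x, n, conf.level = conf.level)$conf.int
  c(rate = x / n, lower = ci[1], upper = ci[2]) * 100
}

# Infected and tested counts per species/population/pathogen/dpi group,
# copied by hand from the data above.
counts <- data.frame(
  group    = c("AA dla dv 21", "AA mra zk 14", "AA yu yv 21", "AL bz zk 14",
               "AL ti dv 21", "AL yu dv 21", "AL yu zk 21"),
  positive = c(3, 2, 1, 2, 2, 2, 0),
  tested   = c(3, 2, 2, 2, 4, 3, 2)
)

# One row of rate/lower/upper per group, bound back onto the counts.
ci <- t(mapply(rate_ci, counts$positive, counts$tested))
infect_ci <- cbind(counts, ci)
print(infect_ci)

# Bar chart with error bars; skipped if ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(infect_ci, aes(x = group, y = rate)) +
    geom_col() +
    geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2) +
    labs(x = "species / population / pathogen / dpi", y = "infectrate (%)") +
    coord_flip()
  print(p)
}
```

The same helper could be reused for `dissemrate` and `transrate` by passing the matching numerator and denominator (for example, positive `dissem` over positive `infect`); groups with missing `trans` values would then return `NA`.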