I'm having trouble counting occurrences in a data frame. My data looks like this:
animal food
1 horse carrot
2 bird seeds
3 monkey banana
4. horse hay
5 bird berries
6. horse seeds
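I'm trying to work out the breakdown by animal for each food. For example, I'd like to find that horses ate 60% of the hay while the other 40% was eaten by birds and monkeys. How can I compute these figures and put them in a separate data frame?
The new data frame should look like this: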
food horse bird monkey
1 carrot 60% 0% 40%
2 seeds 20% 60% 20%
3 banana 0% 0% 100%
4. berries 30% 50% 20%
5. hay 100% 0% 0%
The percentages are obviously not correct; this is just an example.
Answer 0 (score: 5)
Like this?
df <- data.frame(
stringsAsFactors = FALSE,
animal = c("horse","bird",
"monkey","horse","bird","horse"),
food = c("carrot","seeds",
"banana","hay","berries","seeds")
)
with(df, prop.table(table(food, animal), margin = 1)) * 100
animal
food bird horse monkey
banana 0 0 100
berries 100 0 0
carrot 0 100 0
hay 0 100 0
seeds 50 50 0
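Since the question asks for the result as a separate data frame, the table above could be converted along these lines. This is a minimal sketch rather than part of the original answer; the names pct and out are made up for illustration.

```r
df <- data.frame(
  animal = c("horse", "bird", "monkey", "horse", "bird", "horse"),
  food   = c("carrot", "seeds", "banana", "hay", "berries", "seeds"),
  stringsAsFactors = FALSE
)

# Row percentages: each food's row sums to 100
pct <- prop.table(table(df$food, df$animal), margin = 1) * 100

# as.data.frame.matrix() keeps the wide layout (one column per animal);
# plain as.data.frame() on a table would return long format instead
out <- as.data.frame.matrix(pct)
out <- cbind(food = rownames(out), out)  # move the row names into a column
rownames(out) <- NULL
out
```

The animal columns come out in alphabetical order (bird, horse, monkey); reorder them with something like out[c("food", "horse", "bird", "monkey")] if the order in the question matters.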
Answer 1 (score: 2)
You can start by calculating the counts:
xtabs(~ food + animal, data = dat)
# animal
# food bird horse monkey
# banana 0 0 1
# berries 1 0 0
# carrot 0 1 0
# hay 0 1 0
# seeds 1 1 0
From here, the next step depends on what you need. For example, if you want proportions by food, then:
xt <- xtabs(~ food + animal, data = dat)
rowSums(xt)
# banana berries carrot hay seeds
# 1 1 1 1 2
xt / rowSums(xt)
# animal
# food bird horse monkey
# banana 0.0 0.0 1.0
# berries 1.0 0.0 0.0
# carrot 0.0 1.0 0.0
# hay 0.0 1.0 0.0
# seeds 0.5 0.5 0.0
(Multiply by 100 if needed.)
(In hindsight, I think Dominik's use of prop.table here is more appropriate.)
Data:
dat <- structure(list(animal = c("horse", "bird", "monkey", "horse",
"bird", "horse"), food = c("carrot", "seeds", "banana", "hay",
"berries", "seeds")), class = "data.frame", row.names = c("1",
"2", "3", "4.", "5", "6."))
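The division xt / rowSums(xt) works because R recycles the shorter vector down the columns of the table, so element i of rowSums(xt) lines up with row i. The sketch below is my own illustration (by_division, by_prop and by_animal are hypothetical names): it checks that the division matches prop.table, and shows the safer form for column proportions, where dividing by colSums(xt) would recycle silently but line up with the wrong cells.

```r
dat <- data.frame(
  animal = c("horse", "bird", "monkey", "horse", "bird", "horse"),
  food   = c("carrot", "seeds", "banana", "hay", "berries", "seeds"),
  stringsAsFactors = FALSE
)
xt <- xtabs(~ food + animal, data = dat)

# Two routes to the same per-food proportions
by_division <- xt / rowSums(xt)
by_prop     <- prop.table(xt, margin = 1)

# Compare the values only, ignoring table attributes
isTRUE(all.equal(as.vector(by_division), as.vector(by_prop)))

# Proportions within each animal instead: every column sums to 1
by_animal <- prop.table(xt, margin = 2)
colSums(by_animal)
```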
Answer 2 (score: 0)
Try this. You have to create a num column to hold the counts:
library(tidyr)
df <- structure(list(animal = structure(c(2L, 1L, 3L, 2L, 1L, 2L), .Label = c("bird",
"horse", "monkey"), class = "factor"), food = structure(c(3L,
5L, 1L, 4L, 2L, 5L), .Label = c("banana", "berries", "carrot",
"hay", "seeds"), class = "factor"), num = c(1, 1, 1, 1, 1, 1)), row.names = c("1",
"2", "3", "4.", "5", "6."), class = "data.frame")
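# Code
df$num <- 1  # one count per observed (animal, food) row
df2 <- pivot_wider(df, names_from = animal, values_from = num)
df2$Total <- rowSums(df2[, -1], na.rm = TRUE)  # per-food total across the animal columns
# Divide each animal column by the row total (column 5 is Total)
df3 <- cbind(df2[, 1, drop = FALSE], as.data.frame(lapply(df2[, -c(1, 5)], function(x) x / df2$Total)))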
food horse bird monkey
1 carrot 1.0 NA NA
2 seeds 0.5 0.5 NA
3 banana NA NA 1
4 hay 1.0 NA NA
5 berries NA 1.0 NA
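The NA entries above mark food/animal pairs that never occur. If zeros and percentages are preferred, as in the question's example, a variant along these lines might work. It is only a sketch: it assumes a reasonably recent tidyr in which values_fill accepts a single value and values_fn accepts a function, and the names wide, animals and totals are made up for illustration.

```r
library(tidyr)

df <- data.frame(
  animal = c("horse", "bird", "monkey", "horse", "bird", "horse"),
  food   = c("carrot", "seeds", "banana", "hay", "berries", "seeds"),
  stringsAsFactors = FALSE
)
df$num <- 1

# values_fill = 0 writes 0 instead of NA for missing pairs;
# values_fn = sum adds up counts if a pair appears more than once
wide <- as.data.frame(
  pivot_wider(df, names_from = animal, values_from = num,
              values_fill = 0, values_fn = sum)
)

# Convert the counts in each animal column to row percentages
animals <- setdiff(names(wide), "food")
totals  <- rowSums(wide[animals])
wide[animals] <- lapply(wide[animals], function(x) 100 * x / totals)
wide
```

Selecting the animal columns by name avoids relying on the position of a Total column.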