Question

I'm having trouble counting occurrences in a data frame. My data looks like this:

          animal               food

1          horse               carrot
2          bird                seeds
3         monkey               banana 
4.         horse               hay
5          bird                berries
6.         horse               seeds

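I'm trying to work out the breakdown by animal for each food. For example, I'd like to find that horses ate 60% of the hay, while the other 40% was eaten by birds and monkeys. How can I compute these shares and put them in a separate data frame?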

The new data frame should look like this:

          food                 horse      bird       monkey

1          carrot               60%        0%        40%
2          seeds                20%        60%       20%
3          banana               0%         0%        100% 
4.         berries              30%        50%       20%
5.         hay                  100%       0%        0%

The percentages are obviously not correct; this is just an example.

Answer 1

Like this?

df <- data.frame(
  animal = c("horse", "bird", "monkey", "horse", "bird", "horse"),
  food = c("carrot", "seeds", "banana", "hay", "berries", "seeds"),
  stringsAsFactors = FALSE
)

with(df, prop.table(table(food, animal), margin = 1)) * 100

         animal
food      bird horse monkey
  banana     0     0    100
  berries  100     0      0
  carrot     0   100      0
  hay        0   100      0
  seeds     50    50      0
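
Since the question asks for the result in a separate data frame, one possible way to get there is to convert the table with as.data.frame.matrix. This is only a sketch on the same example data; the names pct and out are made up for illustration.

```r
# Same example data as above
df <- data.frame(
  animal = c("horse", "bird", "monkey", "horse", "bird", "horse"),
  food = c("carrot", "seeds", "banana", "hay", "berries", "seeds"),
  stringsAsFactors = FALSE
)

# Row percentages: each food row sums to 100
pct <- prop.table(table(df$food, df$animal), margin = 1) * 100

# as.data.frame.matrix keeps the wide layout (one column per animal);
# plain as.data.frame on a table would return a long layout instead
out <- data.frame(food = rownames(pct), as.data.frame.matrix(pct),
                  stringsAsFactors = FALSE)
rownames(out) <- NULL  # drop the food labels that were carried over as row names
out
```

Here out is an ordinary data frame with a food column followed by one numeric column per animal, so it can be filtered or merged like any other data frame.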

Answer 2

You can start by computing the counts:

xtabs(~ food + animal, data = dat)
#          animal
# food      bird horse monkey
#   banana     0     0      1
#   berries    1     0      0
#   carrot     0     1      0
#   hay        0     1      0
#   seeds      1     1      0

From here, the next step depends on what you need. For example, if you want proportions within each food, then

xt <- xtabs(~ food + animal, data = dat)
rowSums(xt)
#  banana berries  carrot     hay   seeds 
#       1       1       1       1       2 
xt / rowSums(xt)
#          animal
# food      bird horse monkey
#   banana   0.0   0.0    1.0
#   berries  1.0   0.0    0.0
#   carrot   0.0   1.0    0.0
#   hay      0.0   1.0    0.0
#   seeds    0.5   0.5    0.0

(Multiply by 100 if you want percentages.)

(In hindsight, I think Dominik's use of prop.table is the better fit here.)
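
As a quick check that the two approaches agree, the sketch below computes the row proportions both ways on the same example data and compares them. The names manual and via_prop are only illustrative.

```r
# Same example data as in this answer
dat <- data.frame(
  animal = c("horse", "bird", "monkey", "horse", "bird", "horse"),
  food = c("carrot", "seeds", "banana", "hay", "berries", "seeds"),
  stringsAsFactors = FALSE
)

xt <- xtabs(~ food + animal, data = dat)

# Manual version: the vector of row totals is recycled down each column
manual <- xt / rowSums(xt)

# prop.table version: margin = 1 means "proportions within each row"
via_prop <- prop.table(xt, margin = 1)

# Compare the raw numbers, ignoring class and other attributes
same <- isTRUE(all.equal(as.vector(manual), as.vector(via_prop)))
same

# Percentages rather than proportions
round(100 * via_prop, 1)
```

The manual division relies on R recycling the row totals in column-major order, which is easy to get wrong for column proportions; prop.table states the margin explicitly.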

Data:

dat <- structure(list(animal = c("horse", "bird", "monkey", "horse", 
"bird", "horse"), food = c("carrot", "seeds", "banana", "hay", 
"berries", "seeds")), class = "data.frame", row.names = c("1", 
"2", "3", "4.", "5", "6."))

Answer 3

Try this; you have to create a num column to hold the counts:

library(tidyr)

df <- structure(list(animal = structure(c(2L, 1L, 3L, 2L, 1L, 2L), .Label = c("bird", 
"horse", "monkey"), class = "factor"), food = structure(c(3L, 
5L, 1L, 4L, 2L, 5L), .Label = c("banana", "berries", "carrot", 
"hay", "seeds"), class = "factor"), num = c(1, 1, 1, 1, 1, 1)), row.names = c("1", 
"2", "3", "4.", "5", "6."), class = "data.frame")

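# Code
df$num <- 1  # one count per observed (animal, food) row
# Spread the animals into columns; combinations that never occur become NA
df2 <- pivot_wider(df, names_from = animal, values_from = num)
df2$Total <- rowSums(df2[, -1], na.rm = TRUE)  # total count per food

# Divide each animal column by the row total (column 1 is food, column 5 is Total)
df3 <- cbind(df2[, 1, drop = FALSE], as.data.frame(lapply(df2[, -c(1, 5)], function(x) x / df2$Total)))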

     food horse bird monkey
1  carrot   1.0   NA     NA
2   seeds   0.5  0.5     NA
3  banana    NA   NA      1
4     hay   1.0   NA     NA
5 berries    NA  1.0     NA
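
If zeros are preferred over NA for combinations that never occur, a possible variant is to let pivot_wider fill the gaps itself. This is a sketch that assumes a reasonably recent tidyr, where values_fill accepts a single value and values_fn accepts a function; the names wide, animals and total are illustrative. Passing values_fn = sum is an addition not in the code above: it adds up the counts if the same animal and food pair appears more than once.

```r
library(tidyr)

df <- data.frame(
  animal = c("horse", "bird", "monkey", "horse", "bird", "horse"),
  food = c("carrot", "seeds", "banana", "hay", "berries", "seeds"),
  stringsAsFactors = FALSE
)
df$num <- 1  # one count per observed row

# One column per animal; absent pairs become 0, repeated pairs are summed
wide <- pivot_wider(df, names_from = animal, values_from = num,
                    values_fill = 0, values_fn = sum)

# Turn counts into row proportions, leaving the food column untouched
animals <- setdiff(names(wide), "food")
total <- rowSums(wide[animals])
wide[animals] <- lapply(wide[animals], function(x) x / total)
wide
```

Multiply the animal columns by 100 if percentages are needed.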

Counting occurrences in a data frame column in R
