Question

I used the following code to get the mean petal length for each species:

group_sp<-group_by(iris,iris$Species)
mean_plength<-summarise(group_sp,mean(iris$Petal.Length))
mean_plength

and got this output:

`iris$Species` `mean(iris$Petal.Length)`
          <fctr>                     <dbl>
1         setosa                     3.758
2     versicolor                     3.758
3      virginica                     3.758

The value is the same for every species, which is wrong. It should be:

setosa  1.462
versicolor  4.26
virginica   5.552

This happens with all of my datasets. Can anyone tell me what the problem is?

Answer 1

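With dplyr, you do not quote column names or refer to columns with the usual subsetting/selection syntax (i.e. $, [ or [[). Instead, the first argument is always the data, and the remaining arguments refer to columns as if they were regular variables.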

So the correct syntax for your example is:

library("dplyr")
group_sp <- group_by(iris, Species)
mean_plength <- summarise(group_sp, mean_petal_length = mean(Petal.Length))
mean_plength

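What happened in your case is that you referred to the columns iris$Species and iris$Petal.Length in the global environment, rather than in the environment created by the relevant function call. iris$Petal.Length is always the full, ungrouped column, so mean(iris$Petal.Length) returns the overall mean (3.758) for every group.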
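To make the point about the global environment concrete, here is a minimal sketch in base R (no dplyr needed). The variable names overall_mean and per_species are only illustrative; tapply is used here purely as an independent cross-check, not as the recommended dplyr approach.

```r
# iris$Petal.Length always evaluates to the full 150-value column,
# regardless of which group summarise() is currently working on.
overall_mean <- mean(iris$Petal.Length)
overall_mean  # 3.758, the value repeated on every row of the question's output

# Base R cross-check: split the column by species, then take each mean.
per_species <- tapply(iris$Petal.Length, iris$Species, mean)
per_species
```

The per-species values printed by the last line should match what the corrected summarise() call returns.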

Also note that the more idiomatic way to write this code is:

iris %>% 
  group_by(Species) %>% 
  summarize(mean_petal_length = mean(Petal.Length))

This is exactly equivalent to the following, because the magrittr pipe %>% is forward function application:

summarize(group_by(iris, Species), mean_petal_length = mean(Petal.Length))
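If you want to convince yourself that the two forms agree, a quick check along these lines should work, assuming dplyr is installed. The names piped and nested are hypothetical and exist only for this comparison.

```r
library(dplyr)

# Piped form: each step's result becomes the first argument of the next call.
piped <- iris %>%
  group_by(Species) %>%
  summarize(mean_petal_length = mean(Petal.Length))

# Nested form: the same calls written inside out.
nested <- summarize(group_by(iris, Species), mean_petal_length = mean(Petal.Length))

# Compare the two results.
all.equal(piped, nested)
```

Both objects are one-row-per-species tables, so comparing them directly is enough to see that the pipe only changes how the calls are written, not what they compute.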
