Question

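Using the nycflights13 dataset in R, I want to find which flight was the latest in each month; in other words, the flight with the largest departure delay for each month.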

The code I used:

flights %>% group_by(flights$month) %>% summarize(largest_delay = max(flights$dep_delay, na.rm=TRUE))

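This gives me a table that lists the largest departure delay of the whole dataset for every month, rather than the maximum within each month: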

> flights %>% group_by(flights$month) %>% summarize(largest_delay = max(flights$dep_delay, na.rm=TRUE))
# A tibble: 12 x 2
   `flights$month` largest_delay
             <int>         <dbl>
 1               1          1301
 2               2          1301
 3               3          1301
 4               4          1301
 5               5          1301
 6               6          1301
 7               7          1301
 8               8          1301
 9               9          1301
10              10          1301
11              11          1301
12              12          1301

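My question: how would I modify the code above so that it returns the maximum for each month? Also, how can I add an extra column containing the tailnum of the corresponding flight?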

Answer 1

We can use the slice function to do this:

library(nycflights13)
library(dplyr)

flights %>%
    group_by(year, month) %>%
    slice(which.max(dep_delay))
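Because slice keeps the whole row, the tailnum asked about in the question comes along with it, and the result can be narrowed to the columns of interest. A minimal sketch, assuming dplyr is installed; the data frame toy is a hypothetical stand-in for flights that only shares its column names:

```r
library(dplyr)

# Hypothetical stand-in for nycflights13::flights (same column names only)
toy <- tibble(
  month     = c(1, 1, 2, 2),
  tailnum   = c("N1", "N2", "N3", "N4"),
  dep_delay = c(10, 50, NA, 30)
)

worst <- toy %>%
  group_by(month) %>%
  slice(which.max(dep_delay)) %>%    # which.max() skips NA values
  ungroup() %>%
  select(month, tailnum, dep_delay)  # keep only the columns of interest

print(worst)
```

Note that which.max returns only the first maximum, so a month with tied delays yields a single row.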

If you are looking for a base R solution, we can use lapply, split, and which.max:

do.call('rbind', 
       lapply(split(flights, list(flights$year, flights$month)), 
              FUN = function(d) d[which.max(d$dep_delay),]))
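To see what the split step produces before the pieces are bound back together, here is a small sketch in base R only. The data frame toy is hypothetical and merely mimics the relevant columns of flights:

```r
# Hypothetical stand-in for flights, base R only
toy <- data.frame(
  year      = c(2013, 2013, 2013, 2013),
  month     = c(1, 1, 2, 2),
  tailnum   = c("N1", "N2", "N3", "N4"),
  dep_delay = c(10, 50, NA, 30)
)

# One data frame per year/month combination
pieces <- split(toy, list(toy$year, toy$month))
print(names(pieces))

# Keep the row with the largest delay from each piece, then stack the rows
worst <- do.call("rbind",
                 lapply(pieces, function(d) d[which.max(d$dep_delay), ]))
print(worst)
```

Each element of pieces is an ordinary data frame, so the anonymous function can index it by row just as in the one-liner above.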

Answer 2

The problem is your syntax: you should not use flights$ inside the pipe; you should use just the variable names.
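The difference between the two spellings can be sketched on a small example. This assumes dplyr is installed; the data frame toy is hypothetical and stands in for flights:

```r
library(dplyr)

# Hypothetical stand-in for flights
toy <- tibble(
  month     = c(1, 1, 2, 2),
  dep_delay = c(10, 50, NA, 30)
)

# toy$dep_delay is the whole column, so every group gets the global maximum
wrong <- toy %>%
  group_by(month) %>%
  summarize(largest_delay = max(toy$dep_delay, na.rm = TRUE))

# The bare name dep_delay is evaluated within each group
right <- toy %>%
  group_by(month) %>%
  summarize(largest_delay = max(dep_delay, na.rm = TRUE))

print(wrong)
print(right)
```

Applied to the question's code, the same change would read flights %>% group_by(month) %>% summarize(largest_delay = max(dep_delay, na.rm = TRUE)).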
