Question

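I am trying to get a summary of the distance for flights whose airline name ends with the text "Inc.".

So I joined the two data sets to get the airline names: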

flights <- left_join(flights, airlines, by="carrier")

Then I used:

> flights %>% select(name, ends_with("Inc.")) %>% summarise(dist=sum(flights$distance))
# A tibble: 1 x 1
       dist
      <dbl>
1 350217607

and also tried:

> flights %>% filter(name, ends_with("Inc.")) %>% summarise(dist=sum(flights$distance))
Error: No tidyselect variables were registered
Call `rlang::last_error()` to see a backtrace

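But in the first case the result is simply the total over all airlines, not only the ones whose name ends with "Inc." as I specified. The second attempt just gives an error. What am I doing wrong?

Thanks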

Answer 1

You can do this in several ways, as shown below:

library(dplyr)
# The dot is escaped so it matches a literal "." rather than any character
flights %>% filter(grepl("Inc\\.$", name)) %>% summarise(dist = sum(distance))

#       dist
#      <dbl>
#1 249500641

flights %>% summarise(dist = sum(distance[grepl("Inc\\.$", name)]))

flights %>% slice(grep("Inc\\.$", name)) %>% summarise(dist = sum(distance))

Or using base R:

sum(with(flights, distance[endsWith(name, "Inc.")]))
#[1] 249500641

sum(with(flights, distance[grepl("Inc\\.$", name)]))

sum(with(flights, distance[grep("Inc\\.$", name)]))

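One more note: avoid using $ inside a pipe (as in flights$distance). It refers to the original, unfiltered data frame, so the earlier steps of the pipe are ignored and the calculation comes out wrong.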
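
To illustrate the note about $, here is a minimal sketch on a made-up three-row table. The data frame toy and its values are hypothetical and only stand in for the joined flights data:

```r
library(dplyr)

# Hypothetical toy data, not the real flights table
toy <- tibble(
  name     = c("Delta Air Lines Inc.", "Envoy Air", "US Airways Inc."),
  distance = c(100, 200, 300)
)

# toy$distance reaches back to the full table, so the filter has no effect
wrong <- toy %>%
  filter(endsWith(name, "Inc.")) %>%
  summarise(dist = sum(toy$distance))

# a bare distance refers to the rows that survived the filter
right <- toy %>%
  filter(endsWith(name, "Inc.")) %>%
  summarise(dist = sum(distance))

wrong$dist  # 600, the sum over all rows
right$dist  # 400, the sum over the two "Inc." rows
```

This is the same effect seen in the question, where sum(flights$distance) returned the total over all airlines.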

Answer 2

We can use a tidyverse approach:

library(dplyr)
library(stringr)
flights %>%
  filter(str_detect(name, "Inc\\.$")) %>%
  summarise(dist = sum(distance))

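If we use ends_with inside a select statement, it checks the column names and selects the columns that match. Here, the OP wants to select rows. The pattern should therefore be applied with filter to the values of the chosen column.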
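
The difference between the two can be sketched on a small made-up table. The tibble toy and its columns are hypothetical and chosen only so that both a column-name match and a row match are visible:

```r
library(dplyr)
library(stringr)

# Hypothetical toy data, not the real flights table
toy <- tibble(
  name      = c("Delta Air Lines Inc.", "Envoy Air"),
  dep_delay = c(5, 10),
  arr_delay = c(7, 12)
)

# ends_with() looks at column NAMES: keeps dep_delay and arr_delay
cols <- toy %>% select(ends_with("delay"))

# str_detect() inside filter() looks at the VALUES of name: keeps the first row
rows <- toy %>% filter(str_detect(name, "Inc\\.$"))

names(cols)  # "dep_delay" "arr_delay"
nrow(rows)   # 1
```

In the question, select(name, ends_with("Inc.")) found no column whose name ends in "Inc.", so only name was kept and no rows were dropped.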

How to get a distance summary for rows selected with ends_with
