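When knitting an R Markdown file, I have trouble referencing the output of a group_by statement in R. When I refer to a variable from the group_by output after a summarise statement, I get an error message saying that the variable does not exist.
Below is a version of the code that runs in RStudio but fails when knitted in R Markdown.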
    DF1 <- data.frame(name = c("1", "1", "2", "2", "3", "1"),
                      s_id = c("ab", "ab", "cd", "ab", " bc", "ab"),
                      t_id = c("12A", "12A", "12A", "14B", "14B", "14B"))
    breakdown <- DF1 %>%
      group_by(name, s_id) %>%
      summarise(count = n_distinct(t_id))
    breakdown_v2 <- mutate(.data = breakdown,
                           number_of_trips = ifelse(s_id == 'ab', count * 5,
                                             ifelse(s_id == 'cd', count * 2, count * 1)))
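Something similar happened to me before, and I had to mention s_id explicitly in the summarise statement to get around it, but this time that does not work for me.
Any ideas? Thanks.
Update: the actual code used: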
```{r Busiest/Quietest Routes}
# I needed to find the number of distinct trips per service before multiplying out the trips per week.
distinct_trips_breakdown <- Overall_Dublin_Bus_Record %>%
  group_by(route_short_name, service_id) %>%
  summarise(count = n_distinct(trip_id))
distinct_trips_breakdown <- mutate(.data = distinct_trips_breakdown,
                                   number_of_trips_per_week = ifelse(service_id == '1', count * 5,
                                                              ifelse(service_id == '2', count * 2, count * 1)))
Overall_trips_per_week <- distinct_trips_breakdown %>%
  group_by(route_short_name) %>%
  summarise(total_trips_per_week = sum(number_of_trips_per_week))
Busiest_Routes <- top_n(Overall_trips_per_week, 5)
Quiestest_Routes <- top_n(Overall_trips_per_week, -5)
```
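Answer 0 (score: 0)
The code works fine for me. Can you show exactly how you wrote the code chunk? It may be a whitespace issue, or you may have left out the three backticks that close the chunk. Also, remember to load the packages. Try the following chunk, which works for me (with the packages already loaded):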
```{r }
DF1 <- data.frame(name = c("1", "1", "2", "2", "3", "1"),
                  s_id = c("ab", "ab", "cd", "ab", " bc", "ab"),
                  t_id = c("12A", "12A", "12A", "14B", "14B", "14B"))
breakdown <- DF1 %>%
  group_by(name, s_id) %>%
  summarise(count = n_distinct(t_id))
breakdown_v2 <- mutate(.data = breakdown,
                       number_of_trips = ifelse(s_id == 'ab', count * 5,
                                         ifelse(s_id == 'cd', count * 2, count * 1)))
```
Cheers!
Answer 1 (score: 0)
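I eventually resolved this error. The problem came from not loading plyr before dplyr: with plyr attached after dplyr, the bare name summarise resolves to plyr's version, which ignores the grouping and returns only the new column, so s_id is no longer there for the mutate step. I fixed it by putting dplyr:: in front of summarise.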
```{r }
DF1 <- data.frame(name = c("1", "1", "2", "2", "3", "1"),
                  s_id = c("ab", "ab", "cd", "ab", " bc", "ab"),
                  t_id = c("12A", "12A", "12A", "14B", "14B", "14B"))
breakdown <- DF1 %>%
  group_by(name, s_id) %>%
  dplyr::summarise(count = n_distinct(t_id))
breakdown_v2 <- mutate(.data = breakdown,
                       number_of_trips = ifelse(s_id == 'ab', count * 5,
                                         ifelse(s_id == 'cd', count * 2, count * 1)))
```
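For anyone who wants to check the load order in their own document, here is a minimal sketch, assuming both plyr and dplyr are installed. It attaches plyr first so that dplyr's verbs take precedence, looks up which package the bare name summarise resolves to, and then compares the grouped result with what plyr::summarise returns on the same data. The names summarise_owner and masked are made up for this illustration and are not part of the original code.

```{r }
# Assumption: plyr and dplyr are both installed.
# Attach plyr first; dplyr, attached last, then masks plyr's verbs.
library(plyr)
library(dplyr)

# Which package does the unqualified name `summarise` come from?
summarise_owner <- environmentName(environment(summarise))
summarise_owner

DF1 <- data.frame(name = c("1", "1", "2", "2", "3", "1"),
                  s_id = c("ab", "ab", "cd", "ab", " bc", "ab"),
                  t_id = c("12A", "12A", "12A", "14B", "14B", "14B"))

# dplyr's summarise respects the grouping and keeps name and s_id.
breakdown <- DF1 %>%
  group_by(name, s_id) %>%
  summarise(count = n_distinct(t_id))

# plyr's summarise ignores the grouping and keeps only the new column,
# which is why a later reference to s_id fails.
masked <- DF1 %>%
  group_by(name, s_id) %>%
  plyr::summarise(count = n_distinct(t_id))

names(breakdown)
names(masked)
```

If summarise_owner comes back as plyr in your own document, either swap the order of the library() calls or keep the explicit dplyr:: prefix as in the chunk above.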