Question

I have the code below, which used to run but now throws an error. I think a package may have been updated and broken it.

scorecard_data %>% 
  select (STABBR, HBCU, MENONLY, WOMENONLY) %>%
  filter (str_detect(STABBR, "OH|PA|WV|KY|IN|MI")) %>%
  group_by (STABBR) %>% 
  summarize (prcntHBCU = (sum(HBCU, na.rm = TRUE)/length(HBCU[!is.na(HBCU)])*100),
             prcntMEN = (sum(MENONLY, na.rm = TRUE)/length(MENONLY[!is.na(MENONLY)])*100),
             prcntWOMEN = (sum(WOMENONLY, na.rm = TRUE)/length(WOMENONLY[!is.na(WOMENONLY)])*100)) %>%
  gather(key = 'Type.prcnt', value = 'Prcnt', prcntHBCU:prcntWOMEN) %>% 
  ggplot (aes (x = STABBR, y = Prcnt, fill = Type.prcnt)) +
  geom_col(stat = "identity", position = "dodge") + 
  ggtitle ("% of HBCUs, Men Only, and Women Only Institutions - by OH and Neighboring States") +
  xlab ("State") +
  ylab ("Percent of Institutions")

This is the error I get when I run it in RStudio...

Error: Problem with `summarise()` input `prcntHBCU`.
x invalid 'type' (character) of argument
i Input `prcntHBCU` is `(sum(HBCU, na.rm = TRUE)/length(HBCU[!is.na(HBCU)]) * 100)`.
i The error occurred in group 1: STABBR = "IN".
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/dplyr_error>
Problem with `summarise()` input `prcntHBCU`.
x invalid 'type' (character) of argument
i Input `prcntHBCU` is `(sum(HBCU, na.rm = TRUE)/length(HBCU[!is.na(HBCU)]) * 100)`.
i The error occurred in group 1: STABBR = "IN".
Backtrace:
  1. dplyr::select(., STABBR, HBCU, MENONLY, WOMENONLY)
  1. dplyr::filter(., str_detect(STABBR, "OH|PA|WV|KY|IN|MI"))
  1. dplyr::group_by(., STABBR)
  2. dplyr::summarize(...)
 14. dplyr:::h(simpleError(msg, call))

Can anyone help debug this and tell me why it isn't working?

Answer 1

Building on the comments from @gregmacfarlane and @Calumn_You, the likely cause is that sum is being applied to a character vector.

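A simple way to check the variable types is summary(scorecard_data). Numeric variables show the minimum, maximum, median and so on. Character variables only report that the class is character. Factor variables show a count for each level.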
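
As a minimal sketch of that check, the snippet below uses a small made-up data frame (`toy`, not the real scorecard_data) in which HBCU was read in as text. It lists the class of each column and then shows that sum() on the character column fails with the same "invalid 'type' (character)" message as in the question.

```r
# Toy stand-in for scorecard_data; the values are made up for illustration.
toy <- data.frame(
  STABBR  = c("OH", "OH", "PA"),
  HBCU    = c("0", "1", "NULL"),  # stored as text, as can happen after import
  MENONLY = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# One class per column; HBCU shows up as "character".
col_classes <- sapply(toy, class)
print(col_classes)

# sum() refuses character input; capture the message instead of stopping.
err_msg <- tryCatch(
  sum(toy$HBCU, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
print(err_msg)
```

On the real data, sapply(scorecard_data, class) or str(scorecard_data) gives the same kind of overview as summary().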

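Assuming the strings hold numbers, you can convert character to numeric with as.numeric. If the variable is a factor, convert it to character first and then to numeric, i.e. as.numeric(as.character(x)); calling as.numeric directly on a factor returns the internal level codes rather than the values.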
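
A short illustration of the factor case, using a made-up factor `f` rather than a column from the question:

```r
# Made-up factor whose labels look like numbers.
f <- factor(c("10", "20", "10"))

# Direct conversion returns the level codes, not the labels.
codes <- as.numeric(f)

# Going through character first recovers the actual numbers.
values <- as.numeric(as.character(f))

print(codes)
print(values)
```

Any entry that is not a number (for example a text placeholder for a missing value) becomes NA with a warning during the conversion, which the na.rm = TRUE calls in the question would then skip.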

So you are probably looking for something like this:

scorecard_data %>%
  mutate(HBCU = as.numeric(as.character(HBCU)), # if HBCU is of type factor
         MENONLY = as.numeric(MENONLY),         # if MENONLY is of type character
         WOMENONLY = as.numeric(WOMENONLY)) %>%
  # leave STABBR as character: it holds the state code used by filter() and group_by()
  # the rest of your code follows here
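
The conversion and the percentage can also be combined in one small helper. This is only a sketch in base R on made-up data: `pct_flag` is a hypothetical name, and it assumes the columns are 0/1 flags in which any non-numeric entry should count as missing. For a 0/1 flag, mean(x, na.rm = TRUE) equals sum(x, na.rm = TRUE) / length(x[!is.na(x)]), so the result matches the formula in the question.

```r
# Hypothetical helper: percent of non-missing entries equal to 1 in a 0/1 flag.
pct_flag <- function(x) {
  # as.character() first so that factors convert by label, not by level code;
  # non-numeric text becomes NA (the coercion warning is suppressed).
  x <- suppressWarnings(as.numeric(as.character(x)))
  mean(x, na.rm = TRUE) * 100
}

# Made-up data for illustration only.
toy <- data.frame(
  STABBR = c("OH", "OH", "PA", "PA"),
  HBCU   = c("0", "1", "NULL", "1"),
  stringsAsFactors = FALSE
)

# Percent per state, computed with base R.
pct_by_state <- tapply(toy$HBCU, toy$STABBR, pct_flag)
print(pct_by_state)
```

Inside the original pipeline the same helper could be called as summarize(prcntHBCU = pct_flag(HBCU), ...), which would make the separate mutate() step unnecessary.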
