Invalid 'type' error when calculating percentages in summarize

Time: 2020-10-13 19:44:40

Tags: r dplyr

The code below used to run, but now it throws an error. I suspect that a package was updated and broke it.

scorecard_data %>% 
  select (STABBR, HBCU, MENONLY, WOMENONLY) %>%
  filter (str_detect(STABBR, "OH|PA|WV|KY|IN|MI")) %>%
  group_by (STABBR) %>% 
  summarize (prcntHBCU = (sum(HBCU, na.rm = TRUE)/length(HBCU[!is.na(HBCU)])*100),
             prcntMEN = (sum(MENONLY, na.rm = TRUE)/length(MENONLY[!is.na(MENONLY)])*100),
             prcntWOMEN = (sum(WOMENONLY, na.rm = TRUE)/length(WOMENONLY[!is.na(WOMENONLY)])*100)) %>%
  gather(key = 'Type.prcnt', value = 'Prcnt', prcntHBCU:prcntWOMEN) %>% 
  ggplot (aes (x = STABBR, y = Prcnt, fill = Type.prcnt)) +
  geom_col(stat = "identity", position = "dodge") + 
  ggtitle ("% of HBCUs, Men Only, and Women Only Institutions - by OH and Neighboring States") +
  xlab ("State") +
  ylab ("Percent of Institutions")

This is the error I get when I run it in RStudio:

Error: Problem with `summarise()` input `prcntHBCU`.
x invalid 'type' (character) of argument
i Input `prcntHBCU` is `(sum(HBCU, na.rm = TRUE)/length(HBCU[!is.na(HBCU)]) * 100)`.
i The error occurred in group 1: STABBR = "IN".
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/dplyr_error>
Problem with `summarise()` input `prcntHBCU`.
x invalid 'type' (character) of argument
i Input `prcntHBCU` is `(sum(HBCU, na.rm = TRUE)/length(HBCU[!is.na(HBCU)]) * 100)`.
i The error occurred in group 1: STABBR = "IN".
Backtrace:
  1. dplyr::select(., STABBR, HBCU, MENONLY, WOMENONLY)
  1. dplyr::filter(., str_detect(STABBR, "OH|PA|WV|KY|IN|MI"))
  1. dplyr::group_by(., STABBR)
  2. dplyr::summarize(...)
 14. dplyr:::h(simpleError(msg, call))

Can anyone help me debug this and tell me why it isn't working?

1 answer:

Answer 0: (score: 0)

Building on the comments from @gregmacfarlane and @Calumn_You, the likely cause is that sum is being applied to a character vector.

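A simple way to check the variable types is summary(scorecard_data). Numeric variables show the minimum, maximum, median, and so on. Character variables only report that the class is character. Factor variables show a count for each level.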
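As a small illustration of that check, here is a sketch on a made-up data frame. The name toy and its values are hypothetical stand-ins for scorecard_data, and using the string "NULL" as a missing-value marker is only an assumption about how a column like HBCU might end up as character.

```r
# Hypothetical stand-in for scorecard_data; the values are invented.
toy <- data.frame(
  STABBR  = c("OH", "PA", "IN"),
  HBCU    = c("0", "1", "NULL"),  # assumed: a text marker forces the column to character
  MENONLY = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# Class of every column at a glance
col_classes <- sapply(toy, class)
print(col_classes)

# sum() refuses character input; capture the error message instead of stopping
res <- tryCatch(
  sum(toy$HBCU, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)
```

If a column that should be numeric shows up as character here, that would explain the error from summarise().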

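Assuming the strings represent numbers, you can convert character to numeric with as.numeric. If the variable is a factor, convert it to character first and then to numeric, i.e. as.numeric(as.character(x)), because as.numeric applied directly to a factor returns the internal level codes rather than the original values.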
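A minimal sketch of why the intermediate as.character step matters for factors. The vector f is a made-up example, not taken from the question's data.

```r
# Made-up factor whose labels look like numbers
f <- factor(c("10", "20", "10"))

# Direct conversion returns the internal level codes, not the labels
codes <- as.numeric(f)

# Going through character recovers the actual numbers
values <- as.numeric(as.character(f))

print(codes)
print(values)

# Strings that are not numbers become NA (with a warning, suppressed here)
bad <- suppressWarnings(as.numeric(c("1", "NULL")))
print(bad)
```

Any value that cannot be parsed as a number becomes NA, which the na.rm = TRUE and !is.na() parts of the original summarize call would then skip.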

So you may be looking for something like this:

scorecard_data %>%
  # leave STABBR as character: it holds state abbreviations used for filtering and grouping
  mutate(HBCU = as.numeric(as.character(HBCU)), # if HBCU is of type factor
         MENONLY = as.numeric(MENONLY), # if MENONLY is of type character
         WOMENONLY = as.numeric(WOMENONLY)) %>%
#  the rest of your code follows here