dplyr summarise: Error in str.default(obj, ...) : dims [product 11] do not match the length of object [3]

Date: 2016-07-28 14:26:59

Tags: r ggplot2 group-by dplyr summary

I've run into a very frustrating problem when using the dplyr group_by and summarise functions.

Here is my dataset:

> cum_ems_totals
Source: local data frame [12 x 4]

   Chamber Total_emmissions Treatment  Block
    <fctr>            <dbl>    <fctr> <fctr>
1        1        5769.0507         U      1
2        3        7790.1426        IU      1
3        4        5166.8992        AN      1
4        5        7625.7319        AN      2
5        6        1964.0970        IU      2
6        7        5052.1268         U      2
7        9        4207.5324        IU      3
8       10         470.7014        AN      3
9       12        5675.9171         U      3
10      14        5666.1678         U      4
11      15        2134.5002        AN      4
12      16        4093.4687        IU      4

> str(cum_ems_totals)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   12 obs. of  4 variables:
 $ Chamber         : Factor w/ 13 levels "1","3","4","5",..: 1 2 3 4 5 6 7 8 9 11 ...
 $ Total_emmissions: num [1:12, 1] 5769 7790 5167 7626 1964 ...
 $ Treatment       : Factor w/ 4 levels "U","IU","AN",..: 1 2 3 3 2 1 2 3 1 1 ...
 $ Block           : Factor w/ 5 levels "1","2","3","13",..: 1 1 1 2 2 2 3 3 3 5 ...

I now want to compute some summary statistics by Treatment:

cum_ems_summary <- cum_ems_totals %>% filter(Chamber != "10") %>% 
  group_by(Treatment) %>% 
  summarise(n = n(), Mean = mean(Total_emmissions, na.rm = TRUE),
                      SD = sd(Total_emmissions, na.rm = TRUE), SEM = SD/sqrt(n))

This gives me:

> cum_ems_summary
Source: local data frame [3 x 5]

  Treatment     n     Mean        SD       SEM
     <fctr> <int>    <dbl>     <dbl>     <dbl>
1         U     4 5540.816  329.0763  164.5381
2        IU     4 4513.810 2415.6355 1207.8178
3        AN     3 4975.710 2750.6038 1588.0618

So far so good. However, if I try to plot this data with ggplot, I get the following error:

> ggplot(cum_ems_summary, aes(x = Treatment, y = Mean, fill = Treatment)) + geom_bar(stat = "identity")
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 3, 11

Calling str on the summary data frame gives this:

> str(cum_ems_summary)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   3 obs. of  5 variables:
 $ Treatment: Factor w/ 4 levels "U","IU","AN",..: 1 2 3
 $ n        : int  4 4 3
 $ Mean     :
Error in str.default(obj, ...) : 
  dims [product 11] do not match the length of object [3]

I don't understand what is going on here! Can anyone help?

3 Answers:

Answer 0: (score: 3)

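# Reproduce the error
str(cum_ems_summary)
# Error in str.default(obj, ...) : 
#   dims [product 11] do not match the length of object [3]

# Fix: Total_emmissions is stored as a one-column matrix (see the str() output
# in the question); c() drops the dim attribute and leaves a plain numeric vector
cum_ems_totals$Total_emmissions <- c(cum_ems_totals$Total_emmissions)

# Try again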
cum_ems_summary <- cum_ems_totals %>% filter(Chamber != "10") %>% 
  group_by(Treatment) %>% 
  summarise(n = n(), Mean = mean(Total_emmissions, na.rm = TRUE),
            SD = sd(Total_emmissions, na.rm = TRUE), SEM = SD/sqrt(n))

ggplot(cum_ems_summary, aes(x = Treatment, y = Mean, fill = Treatment)) + geom_bar(stat = "identity")

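As a minimal sketch of why the `c()` call helps, the following base R snippet uses hypothetical toy values (not the asker's data) to show that a one-column matrix carries a `dim` attribute and that `c()` removes it. The variable names `x` and `v` are illustrative only.

```r
# Hypothetical toy values; a one-column matrix such as scale() would return
x <- matrix(c(5769.05, 7790.14, 5166.90), ncol = 1)

is.matrix(x)   # TRUE: x has a dim attribute
dim(x)         # 3 rows, 1 column

# c() drops the dim attribute, leaving a plain numeric vector
v <- c(x)

is.matrix(v)     # FALSE
is.null(dim(v))  # TRUE
length(v)        # 3, the values themselves are unchanged
```

`as.vector(x)` or `as.numeric(x)` would have the same effect here; which one to use is mostly a matter of taste.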

Answer 1: (score: 1)

I just ran into the same problem and solved it by adding mutate_if at the end, in case it is useful:

df2 <- df %>% 
  group_by(group) %>% 
  mutate_each(funs(scale, mean)) %>%   # scale() returns a matrix
  mutate_if(is.matrix, as.vector)      # convert matrix columns back to plain vectors
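For readers without dplyr at hand, the same idea can be sketched in base R. This assumes a hypothetical toy data frame `df` with made-up `group` and `value` columns; it is an analogue of the `mutate_if(is.matrix, as.vector)` step, not the answerer's exact code.

```r
# Hypothetical toy data frame
df <- data.frame(group = c("a", "a", "b", "b"), value = c(1, 2, 3, 5))

# scale() returns a one-column matrix, so the new column is a matrix column
df$scaled <- scale(df$value)
is.matrix(df$scaled)   # TRUE

# Base R analogue of mutate_if(is.matrix, as.vector):
# convert every matrix column to a plain vector, leave the others alone
df[] <- lapply(df, function(col) if (is.matrix(col)) as.vector(col) else col)

is.matrix(df$scaled)   # FALSE
```

In dplyr 1.0 and later, `mutate(across(where(is.matrix), as.vector))` expresses the same step without the superseded `mutate_if`.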

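Answer 2: (score: -1)

Couldn't the error message be related to the fact that your Treatment has 4 levels, when it should have 3 levels, "U", "IU", "AN", assigned the codes 1, 2, 3, with the extra level ".." having no number assigned?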
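One way to check the factor-level idea is to inspect and drop unused levels. The sketch below uses a hypothetical factor whose fourth level is called "X"; the name of the real fourth level is not shown in the question, because `str()` truncates the level list with `..`.

```r
# Hypothetical factor with one unused level ("X"), as can happen after filtering
treatment <- factor(c("U", "IU", "AN", "AN"),
                    levels = c("U", "IU", "AN", "X"))

nlevels(treatment)   # 4: unused levels are kept
table(treatment)     # counts per level; the unused level has count 0

# droplevels() removes levels that no longer occur in the data
treatment_clean <- droplevels(treatment)
levels(treatment_clean)
```

If the `str()` error persists after dropping unused levels, the extra level is not the cause.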