group_by() and sum() give rather unexpected results

Asked: 2018-01-08 02:54:08

Tags: r excel group-by dplyr summarize

I imported a file with 1,071 rows and 16 columns from Excel. It lists every renewable energy project that has won an energy auction in Brazil since 2005.

> library(readxl)
> HIST <- read_excel("D:/Paulo/Desktop/2018 Historico Leiloes Total para R.xls", 
+     col_types = c("numeric", "text", "text", 
+         "text", "text", "text", "date", "numeric", 
+         "numeric", "numeric", "numeric", 
+         "numeric", "numeric", "numeric", 
+         "numeric", "numeric"))

> HIST
# A tibble: 1,071 x 16
     Ano        Leilao Fonte    UF                Vend
   <dbl>         <chr> <chr> <chr>               <chr>
 1  2011      2011 LER   Bio    SP      Louis Dreyffus
 2  2008 2008 Leilao 1   Bio    MG                CMAA
 3  2008 2008 Leilao 1   Bio    SP              RAIZEN
 4  2017      2017 A-4    PV    PI                Enel
 5  2013    2013 A-5 2  PPEE    BA               CHESF
 6  2008 2008 Leilao 1   Bio    SP             Abengoa
 7  2008 2008 Leilao 1   Bio    GO                 N/A
 8  2009      2009 A-3  PPEE    RS           Eletrosul
 9  2017      2017 A-4    PV    BA SOLATIO SINDUSTRIAL
10  2009      2009 A-3  PPEE    RS           Eletrosul
# ... with 1,061 more rows, and 11 more variables: Projeto <chr>, CODPPA <dttm>,
#   CAPEX <dbl>, MW <dbl>, GF <dbl>, FC <dbl>, PPA <dbl>, RMW <dbl>, WACC <dbl>,
#   TIR <dbl>, VPL <dbl>

Then I loaded dplyr:

> library(dplyr)

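Finally, I grouped by year (Ano in Portuguese) and summarised the sum of MW (megawatts), because I want to know how many megawatts were auctioned each year. I got the following rather disappointing result: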

> HIST %>%
+     group_by(HIST$Ano)%>%
+     summarise(sum(HIST$MW))

# A tibble: 13 x 2
   HIST$Ano sum(HIST$MW)
    <dbl>          <dbl>
 1       2005       72677.67
 2       2006       72677.67
 3       2007       72677.67
 4       2008       72677.67
 5       2009       72677.67
 6       2010       72677.67
 7       2011       72677.67
 8       2012       72677.67
 9       2013       72677.67
10       2014       72677.67
11       2015       72677.67
12       2016       72677.67
13       2017       72677.67

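Shouldn't it be the sum of MW for each year? Instead it shows the grand total of MW and repeats that value for every year. What am I doing wrong?

Thank you, Paulo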

1 Answer:

Answer 0: (score: 2)

# Bare column names, not HIST$..., so each sum is computed per group
HIST %>%
    group_by(Ano) %>%
    summarise(soma = sum(MW))
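
To see the difference on a small scale, here is a minimal sketch with made-up data. The data frame `toy` and its values are invented for illustration and are not taken from HIST; only the column names `Ano` and `MW` come from the question. The likely cause of the repeated total is that `toy$MW` extracts the entire column from the original data frame, so `sum()` never sees the per-group subset that dplyr prepares.

```r
library(dplyr)

# Hypothetical stand-in for HIST; the values are invented
toy <- data.frame(
  Ano = c(2005, 2005, 2006, 2006, 2006),
  MW  = c(10, 20, 1, 2, 3)
)

# The pattern from the question: toy$MW is always the full column,
# so every group receives the grand total
errado <- toy %>%
  group_by(Ano) %>%
  summarise(soma = sum(toy$MW))

# The pattern from the answer: the bare name MW is evaluated within each group
por_ano <- toy %>%
  group_by(Ano) %>%
  summarise(soma = sum(MW))

print(errado)
print(por_ano)
```

In this toy case the first result repeats 36 for both years, while the second gives 30 for 2005 and 6 for 2006.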