Question

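I have a large data frame (AT_df) covering many countries over many years, but it has no annual totals. The initial dataset has already been reduced to a single Pollutant_name (x1 = "CO2"), and I have subset it to one country with all of its sub-categories.

I am preparing this data so I can plot it with ggplot2 later, but for that I need to add one row per year holding the total across the categories (x4 = 1-6).

The data looks like this (excerpt):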

      x     y         x1   x2       x4  x6
1553  1993  0.00000   CO2  Austria  6   6 - Other Sector
1554  2006  0.00000   CO2  Austria  6   6 - Other Sector
1555  2015  0.00000   CO2  Austria  6   6 - Other Sector
2243  1998  12.07760  CO2  Austria  5   5 - Waste management
2400  1992  11.12720  CO2  Austria  5   5 - Waste management
2401  1995  11.11040  CO2  Austria  5   5 - Waste management
2402  2006  10.26000  CO2  Austria  5   5 - Waste management
2489  1998  0.00000   CO2  Austria  6   6 - Other Sector

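I would like to insert a row labelled x6 = "Aggregate" that holds the sum of y (emissions) for a given year (x = year_xyz) and country (x2 = country_xyz).

Basically something like this: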

sum(subset(AT_df, x4 %in% c("1", "2", "3", "4", "5", "6") & x == "yearxyz" & x2 == "Austria")$y)

and then insert that into the data frame for every year (16 years in total).

I have tried a few things I found on Stack Overflow, such as:

rbind(AT_df, data.frame(x1='Aggregate', y = sum(AT_df$y)))

...but I could not write any code that works properly.

Thank you in any case; any help is appreciated.

Answer 1

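You can first prepare a data frame that contains the summary data and has the same shape as your AT_df, and then combine the two. There are many ways to do this in R; here I use the dplyr package. Because the sample data is not sufficient to demonstrate this fully, I will also create some artificial data first. After that, the following steps are needed:

Name all columns that should be kept when summarising (function group_by).
Aggregate some columns and assign the output to a column (function summarise).
Add a column for the variable that is now missing (function mutate).
Combine the resulting data frame with the original data frame (function union_all).

The final filter is only there to show some representative data.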

set.seed(42)
df <- expand.grid(year = 1993:2015,
                  pollutant = "CO2",
                  country = LETTERS,
                  sector = 1L:6L)

df$amount <- runif(nrow(df), 0, 15)

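library("dplyr")
df %>%
  group_by(year, pollutant, country) %>%  # columns to keep
  summarise(amount = sum(amount)) %>%     # total over all sectors
  mutate(sector = -1L) %>%                # -1 marks the total rows
  union_all(df) %>%                       # append the original rows
  filter(country == "A" & year == 1996)   # show one country and year only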
#> # A tibble: 7 x 5
#> # Groups:   year, pollutant [1]
#>    year pollutant country amount sector
#>   <int> <fct>     <fct>    <dbl>  <int>
#> 1  1996 CO2       A        41.5      -1
#> 2  1996 CO2       A        12.5       1
#> 3  1996 CO2       A         4.24      2
#> 4  1996 CO2       A         6.70      3
#> 5  1996 CO2       A         1.88      4
#> 6  1996 CO2       A         9.40      5
#> 7  1996 CO2       A         6.82      6
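
As a rough sketch, the same steps could be applied to the column names from the question (x = year, y = emissions, x1 = pollutant, x2 = country, x4 = sector code, x6 = sector label). The small AT_df below is a made-up stand-in with illustrative values, not the real data, and it assumes x4 is stored as character. The "Total" code for x4 is a hypothetical choice. bind_rows is used here in place of union_all, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical stand-in for AT_df (illustrative values only).
AT_df <- data.frame(
  x  = c(1993, 1993, 2006, 2006),
  y  = c(0, 11.5, 0, 10.26),
  x1 = "CO2",
  x2 = "Austria",
  x4 = c("6", "5", "6", "5"),   # assumed to be character
  x6 = c("6 - Other Sector", "5 - Waste management",
         "6 - Other Sector", "5 - Waste management"),
  stringsAsFactors = FALSE
)

# One total row per year, pollutant and country.
totals <- AT_df %>%
  group_by(x, x1, x2) %>%
  summarise(y = sum(y), .groups = "drop") %>%
  mutate(x4 = "Total", x6 = "Aggregate")   # fill the columns dropped by summarise

# Append the totals to the original rows and sort for readability.
AT_with_totals <- bind_rows(AT_df, totals) %>%
  arrange(x2, x, x4)

print(AT_with_totals)
```

The rows with x6 == "Aggregate" can then be kept or filtered out as needed before plotting.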

Data frame with multiple years and annual totals [R]
