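My question is as follows: with mutate I can create new columns as combinations of existing ones, but what if I need to create new rows as combinations of existing ones? For example, consider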
df<-structure(list(year = c(2013L, 2014L, 2015L, 2016L, 2017L, 2013L,
2014L, 2015L, 2016L, 2017L), reporter = c("EU28", "EU28", "EU28",
"EU28", "EU28", "UK", "UK", "UK", "UK", "UK"), partner = c("ACP",
"ACP", "ACP", "ACP", "ACP", "ACP", "ACP", "ACP", "ACP", "ACP"
), nace = c("FDI", "FDI", "FDI", "FDI", "FDI", "FDI", "FDI",
"FDI", "FDI", "FDI"), inward_stock = c(85483.9, 108674.6, 98536.9,
114328.5, 174077.2, 4733.1, 5229.2, 5892.5, 7542.7, 20759), outward_stock = c(189229.3,
223497.6, 325336.3, 301348.9, 304675.4, 38683, 46732.6, 49357.3,
46985.6, 39748.4)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-10L))
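I would like to add new rows for 2013-2017 with a new reporter, EU27, whose inward and outward stock values are those of EU28 minus the UK's contribution.
For example, for 2013 the EU27 inward_stock would be 85484 - 4733 = 80751 and the outward_stock 189229 - 38683 = 150546.
Does anyone know how to achieve this without cumbersome pivoting/unpivoting? I would like to add 5 new rows in total.
Thanks!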
Answer 0 (score: 2)
I think in this case it is natural to use summarize() to create the data for EU27 and then bind it to the original data:
library(tidyverse)
eu27 <- df %>%
  group_by_at(vars(-reporter, -ends_with("stock"))) %>%
  summarize_at(vars(inward_stock, outward_stock), ~ {
    .x[reporter == "EU28"] - .x[reporter == "UK"]
  }) %>%
  mutate(reporter = "EU27")
bind_rows(df, eu27)
#> # A tibble: 15 x 6
#> year reporter partner nace inward_stock outward_stock
#> <int> <chr> <chr> <chr> <dbl> <dbl>
#> 1 2013 EU28 ACP FDI 85484. 189229.
#> 2 2014 EU28 ACP FDI 108675. 223498.
#> 3 2015 EU28 ACP FDI 98537. 325336.
#> 4 2016 EU28 ACP FDI 114328. 301349.
#> 5 2017 EU28 ACP FDI 174077. 304675.
#> 6 2013 UK ACP FDI 4733. 38683
#> 7 2014 UK ACP FDI 5229. 46733.
#> 8 2015 UK ACP FDI 5892. 49357.
#> 9 2016 UK ACP FDI 7543. 46986.
#> 10 2017 UK ACP FDI 20759 39748.
#> 11 2013 EU27 ACP FDI 80751. 150546.
#> 12 2014 EU27 ACP FDI 103445. 176765
#> 13 2015 EU27 ACP FDI 92644. 275979
#> 14 2016 EU27 ACP FDI 106786. 254363.
#> 15 2017 EU27 ACP FDI 153318. 264927
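The same idea can be sketched with across(), which dplyr 1.0.0 and later offer in place of the scoped summarize_at()/group_by_at() verbs. This is only a sketch, not the answer author's code: it assumes dplyr >= 1.0.0, and the data frame is rebuilt compactly from the question's values so the block runs on its own.

```r
library(dplyr)

# Same values as the question's df, rebuilt compactly (assumption: the
# "tbl_df" class details do not matter for this example).
df <- tibble(
  year = rep(2013:2017, times = 2),
  reporter = rep(c("EU28", "UK"), each = 5),
  partner = "ACP",
  nace = "FDI",
  inward_stock = c(85483.9, 108674.6, 98536.9, 114328.5, 174077.2,
                   4733.1, 5229.2, 5892.5, 7542.7, 20759),
  outward_stock = c(189229.3, 223497.6, 325336.3, 301348.9, 304675.4,
                    38683, 46732.6, 49357.3, 46985.6, 39748.4)
)

# One row per (year, partner, nace): EU28 value minus UK value.
eu27 <- df %>%
  group_by(year, partner, nace) %>%
  summarise(
    across(ends_with("stock"),
           ~ .x[reporter == "EU28"] - .x[reporter == "UK"]),
    .groups = "drop"
  ) %>%
  mutate(reporter = "EU27")

result <- bind_rows(df, eu27)
```

Because the subtraction selects rows by reporter name within each group, it does not depend on the order of the rows in df.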
Here is also a version that does the cumbersome pivoting, using the new pivot_* functions in tidyr:
df %>%
  pivot_longer(cols = ends_with("stock"), names_to = "variable") %>%
  pivot_wider(names_from = reporter) %>%
  mutate(EU27 = EU28 - UK) %>%
  pivot_longer(cols = c(EU28, UK, EU27), names_to = "reporter") %>%
  pivot_wider(names_from = variable)
#> # A tibble: 15 x 6
#> year partner nace reporter inward_stock outward_stock
#> <int> <chr> <chr> <chr> <dbl> <dbl>
#> 1 2013 ACP FDI EU28 85484. 189229.
#> 2 2013 ACP FDI UK 4733. 38683
#> 3 2013 ACP FDI EU27 80751. 150546.
#> 4 2014 ACP FDI EU28 108675. 223498.
#> 5 2014 ACP FDI UK 5229. 46733.
#> 6 2014 ACP FDI EU27 103445. 176765
#> 7 2015 ACP FDI EU28 98537. 325336.
#> 8 2015 ACP FDI UK 5892. 49357.
#> 9 2015 ACP FDI EU27 92644. 275979
#> 10 2016 ACP FDI EU28 114328. 301349.
#> 11 2016 ACP FDI UK 7543. 46986.
#> 12 2016 ACP FDI EU27 106786. 254363.
#> 13 2017 ACP FDI EU28 174077. 304675.
#> 14 2017 ACP FDI UK 20759 39748.
#> 15 2017 ACP FDI EU27 153318. 264927
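I think the above reveals an interesting pattern that might be useful. A single pivot function, if one existed, could perform both the "longer" and the "wider" step in one call, along these lines: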
df %>%
  pivot(ends_with("stock"), names_to = "variable", names_from = reporter) %>%
  mutate(EU27 = EU28 - UK) %>%
  pivot(c(EU28, UK, EU27), names_to = "reporter", names_from = variable)
Created on 2019-10-17 by the reprex package (v0.3.0)
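tidyr does not provide a pivot() function, so the combined call above cannot be run as written. One way to try the pattern is a small wrapper around pivot_longer() and pivot_wider(). The name pivot_both() and its argument list are hypothetical and chosen only for this sketch, and the data frame is rebuilt from the question's values so the block is self-contained.

```r
library(dplyr)
library(tidyr)

# Same values as the question's df, rebuilt compactly.
df <- tibble(
  year = rep(2013:2017, times = 2),
  reporter = rep(c("EU28", "UK"), each = 5),
  partner = "ACP",
  nace = "FDI",
  inward_stock = c(85483.9, 108674.6, 98536.9, 114328.5, 174077.2,
                   4733.1, 5229.2, 5892.5, 7542.7, 20759),
  outward_stock = c(189229.3, 223497.6, 325336.3, 301348.9, 304675.4,
                    38683, 46732.6, 49357.3, 46985.6, 39748.4)
)

# Hypothetical helper (not part of tidyr): lengthen `cols` into a
# name/value pair, then widen again using the column `names_from`.
pivot_both <- function(data, cols, names_to, names_from) {
  data %>%
    pivot_longer(cols = {{ cols }}, names_to = names_to, values_to = "value") %>%
    pivot_wider(names_from = {{ names_from }}, values_from = "value")
}

result <- df %>%
  pivot_both(ends_with("stock"), names_to = "variable", names_from = reporter) %>%
  mutate(EU27 = EU28 - UK) %>%
  pivot_both(c(EU28, UK, EU27), names_to = "reporter", names_from = variable)
```

The first call swaps the stock columns for one column per reporter, and the second call swaps them back, so the EU27 column created in between ends up as five new rows.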
Answer 1 (score: 0)
I am not sure how dynamic you want this to be without pivoting, but here is a solution that works for your example:
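new_reporter <- 'EU27'
# Split by reporter; the list elements are named "EU28" and "UK".
l1 <- split(df, df$reporter)
# Subtract UK from EU28 row by row (assumes both parts list the years in the same order).
rbind(df, data.frame(year = l1[["EU28"]]$year,
                     reporter = new_reporter,
                     partner = l1[["EU28"]]$partner,
                     nace = l1[["EU28"]]$nace,
                     l1[["EU28"]][c("inward_stock", "outward_stock")] -
                       l1[["UK"]][c("inward_stock", "outward_stock")]))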
# A tibble: 15 x 6
# year reporter partner nace inward_stock outward_stock
# <int> <chr> <chr> <chr> <dbl> <dbl>
# 1 2013 EU28 ACP FDI 85484. 189229.
# 2 2014 EU28 ACP FDI 108675. 223498.
# 3 2015 EU28 ACP FDI 98537. 325336.
# 4 2016 EU28 ACP FDI 114328. 301349.
# 5 2017 EU28 ACP FDI 174077. 304675.
# 6 2013 UK ACP FDI 4733. 38683
# 7 2014 UK ACP FDI 5229. 46733.
# 8 2015 UK ACP FDI 5892. 49357.
# 9 2016 UK ACP FDI 7543. 46986.
#10 2017 UK ACP FDI 20759 39748.
#11 2013 EU27 ACP FDI 80751. 150546.
#12 2014 EU27 ACP FDI 103445. 176765
#13 2015 EU27 ACP FDI 92644. 275979
#14 2016 EU27 ACP FDI 106786. 254363.
#15 2017 EU27 ACP FDI 153318. 264927
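The split-based code subtracts by row position, so it relies on the EU28 and UK rows listing the years in the same order. If that cannot be guaranteed, a base R merge() on the key columns aligns the rows explicitly. This is a sketch under the assumption that year, partner and nace identify a row within each reporter; the data frame is rebuilt from the question's values.

```r
# Same values as the question's df, as a plain data.frame.
df <- data.frame(
  year = rep(2013:2017, times = 2),
  reporter = rep(c("EU28", "UK"), each = 5),
  partner = "ACP",
  nace = "FDI",
  inward_stock = c(85483.9, 108674.6, 98536.9, 114328.5, 174077.2,
                   4733.1, 5229.2, 5892.5, 7542.7, 20759),
  outward_stock = c(189229.3, 223497.6, 325336.3, 301348.9, 304675.4,
                    38683, 46732.6, 49357.3, 46985.6, 39748.4),
  stringsAsFactors = FALSE
)

# Align EU28 and UK rows on the key columns instead of on row position.
m <- merge(df[df$reporter == "EU28", ], df[df$reporter == "UK", ],
           by = c("year", "partner", "nace"),
           suffixes = c("_eu28", "_uk"))

eu27 <- data.frame(
  year = m$year,
  reporter = "EU27",
  partner = m$partner,
  nace = m$nace,
  inward_stock = m$inward_stock_eu28 - m$inward_stock_uk,
  outward_stock = m$outward_stock_eu28 - m$outward_stock_uk,
  stringsAsFactors = FALSE
)

result <- rbind(df, eu27)
```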
Answer 2 (score: 0)
I have this solution, using gather and spread (both from tidyr) together with dplyr:
library(dplyr)
library(tidyr)  # gather() and spread() live in tidyr
df %>%
  gather(type, stock, -c(year, reporter, partner, nace)) %>%
  spread(reporter, stock) %>%
  mutate(EU27 = EU28 - UK) %>%
  gather(reporter, stock, -c(year, partner, nace, type)) %>%
  spread(type, stock)
Output:
# A tibble: 15 x 6
year partner nace reporter inward_stock outward_stock
<int> <chr> <chr> <chr> <dbl> <dbl>
1 2013 ACP FDI EU27 80751. 150546.
2 2013 ACP FDI EU28 85484. 189229.
3 2013 ACP FDI UK 4733. 38683
4 2014 ACP FDI EU27 103445. 176765
5 2014 ACP FDI EU28 108675. 223498.
6 2014 ACP FDI UK 5229. 46733.
7 2015 ACP FDI EU27 92644. 275979
8 2015 ACP FDI EU28 98537. 325336.
9 2015 ACP FDI UK 5892. 49357.
10 2016 ACP FDI EU27 106786. 254363.
11 2016 ACP FDI EU28 114328. 301349.
12 2016 ACP FDI UK 7543. 46986.
13 2017 ACP FDI EU27 153318. 264927
14 2017 ACP FDI EU28 174077. 304675.
15 2017 ACP FDI UK 20759 39748.
Answer 3 (score: 0)
Using the data.table package, we can solve the problem as follows:
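library(data.table)
library(magrittr)  # provides %>%, which data.table does not export
# Sort so that UK precedes EU28 within each year; diff() then yields EU28 - UK.
setorder(setDT(df), year, -reporter)
df[, .(reporter = "EU27", partner = "ACP", nace = "FDI",
       inward_stock = diff(inward_stock),
       outward_stock = diff(outward_stock)), by = year] %>%
  rbind(df)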
Another option:
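library(data.table)
library(magrittr)  # provides %>%
# Sort so that EU28 precedes UK within each year; -diff() then yields EU28 - UK.
setorder(setDT(df), year, reporter)
df[, -"reporter"][, c(.(reporter = "EU27"),
                      lapply(.SD, function(x) if (is.numeric(x)) -diff(x) else unique(x))),
                  by = year] %>%
  rbind(df)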
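Both data.table variants depend on the row order within each year, because diff() subtracts neighbouring rows. A join on the key columns avoids that dependence. The following is a sketch, not the answer author's code: it assumes year, partner and nace identify a row within each reporter, and it uses the name dt for a data.table rebuilt from the question's values.

```r
library(data.table)

# Same values as the question's df, as a data.table.
dt <- data.table(
  year = rep(2013:2017, times = 2),
  reporter = rep(c("EU28", "UK"), each = 5),
  partner = "ACP",
  nace = "FDI",
  inward_stock = c(85483.9, 108674.6, 98536.9, 114328.5, 174077.2,
                   4733.1, 5229.2, 5892.5, 7542.7, 20759),
  outward_stock = c(189229.3, 223497.6, 325336.3, 301348.9, 304675.4,
                    38683, 46732.6, 49357.3, 46985.6, 39748.4)
)

# Join the EU28 rows (x) to the UK rows (i) on the key columns;
# the i. prefix refers to the UK side of the join.
eu27 <- dt[reporter == "EU28"][
  dt[reporter == "UK"],
  on = .(year, partner, nace),
  .(year, reporter = "EU27", partner, nace,
    inward_stock = inward_stock - i.inward_stock,
    outward_stock = outward_stock - i.outward_stock)
]

result <- rbind(dt, eu27)
```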