My data frame has the following structure:
df <- structure(list(name1 = c("A","A","B","B","A","A","B","B"),
name2 = c("B","B","C","C","ALL","ALL","ALL","ALL"),
pair_id = c(1,1,2,2,3,3,4,4),
year = c(2010, 2011, 2010, 2011, 2010, 2011,2010, 2011),
var1 = c(1.5,2,4,5,12,15,20,18)),
.Names = c("name1","name2","pair_id","year", "var1"),
row.names = c("1", "2", "3", "4", "5", "6", "7", "8"), class =("data.frame"))
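I want to compute the share of var1 for each year and pair_id, using the var1 value of the row where name2 is "ALL" (for the same name1 and year) as the denominator. The output should look like this: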
df <- structure(list(name1 = c("A","A","B","B","A","A","B","B"),
name2 = c("B","B","C","C","ALL","ALL","ALL","ALL"),
pair_id = c(1,1,2,2,3,3,4,4),
year = c(2010, 2011, 2010, 2011,2010,2011,2010,2011),
var1 = c(1.5,2,4,5,12,15,20,18),
var1_share = c(0.125,0.133333,0.2,0.2777,1,1,1,1)),
.Names = c("name1","name2","pair_id","year", "var1","var1_share"),
row.names = c("1", "2", "3", "4", "5", "6", "7", "8"), class =("data.frame"))
Thanks in advance!
Answer 0 (score: 1)
A dplyr solution:
library(dplyr)

df %>%
  group_by(name1, year) %>%
  # denominator: var1 of the "ALL" row within the same name1/year group
  mutate(denom = var1[name2 == "ALL"]) %>%
  mutate(var1_share = var1 / denom)
# # A tibble: 8 x 7
# # Groups: name1, year [4]
# name1 name2 pair_id year var1 denom var1_share
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 A B 1 2010 1.5 12 0.1250000
# 2 A B 1 2011 2.0 15 0.1333333
# 3 B C 2 2010 4.0 20 0.2000000
# 4 B C 2 2011 5.0 18 0.2777778
# 5 A ALL 3 2010 12.0 12 1.0000000
# 6 A ALL 3 2011 15.0 15 1.0000000
# 7 B ALL 4 2010 20.0 20 1.0000000
# 8 B ALL 4 2011 18.0 18 1.0000000
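For comparison, here is a minimal base R sketch of the same calculation, in case dplyr is not available. It is not part of the original answer: the helper names `totals`, `key_all` and `key_totals` are made up for illustration, and it assumes that the "ALL" row for each name1/year combination holds the denominator, as in the example data.

```r
# Rebuild the example data from the question.
df <- data.frame(
  name1   = c("A", "A", "B", "B", "A", "A", "B", "B"),
  name2   = c("B", "B", "C", "C", "ALL", "ALL", "ALL", "ALL"),
  pair_id = c(1, 1, 2, 2, 3, 3, 4, 4),
  year    = c(2010, 2011, 2010, 2011, 2010, 2011, 2010, 2011),
  var1    = c(1.5, 2, 4, 5, 12, 15, 20, 18),
  stringsAsFactors = FALSE
)

# Lookup table of denominators (hypothetical helper): the "ALL" rows only.
totals <- df[df$name2 == "ALL", c("name1", "year", "var1")]

# Build a combined name1/year key and look up each row's denominator.
key_all    <- paste(df$name1, df$year)
key_totals <- paste(totals$name1, totals$year)
df$var1_share <- df$var1 / totals$var1[match(key_all, key_totals)]

print(df)
```

Because `match()` returns `NA` when a key is not found, a name1/year combination without an "ALL" row gets an `NA` share here. If a combination has more than one "ALL" row, `match()` uses the first one.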