I have the following dataset:
adv_id<-c("abc","abc","abc","pqr","pqr","pqr","xyz","xyz","xyz")
seg_id<-c("1","2","3","4","5","6","7","8","9")
value<-c(120,450,108,1004,567,768,111,222,3334)
data<-data.frame(adv_id,seg_id,value)
I want to calculate the percent of parent total at the adv_id level, the way an Excel pivot table does. So my output should look like this:
adv_id seg_id value percentage
abc 1 120 17.7%
abc 2 450 66.37%
abc 3 108 15.93%
pqr 4 1004 42.92%
pqr 5 567 24.24%
pqr 6 768 32.83%
xyz 7 111 3.03%
xyz 8 222 6.05%
xyz 9 3334 90.92%
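To make the target concrete: each percentage in the table is the row's value divided by the sum of value within the same adv_id. A quick check of that arithmetic for the abc group only (the names `abc` and `pct` are just for illustration, they are not part of my data):

```r
# values of the three "abc" rows from the dataset above
abc <- c(120, 450, 108)

# share of each row in the group total, as a percentage rounded to 2 decimals
pct <- round(abc / sum(abc) * 100, 2)
print(pct)
```

This reproduces the abc rows of the table (17.7, 66.37, 15.93); what I need is the same calculation repeated for every adv_id.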
I used the following code:
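totaluser = 0
percent = c()
# walk the rows, comparing each adv_id with the one on the previous row
for (i in 2:nrow(data))
{
  if (data$adv_id[i] == data$adv_id[i-1])
  {
    # accumulate a total while the adv_id stays the same
    totaluser = totaluser + data$value[i]
    percent[i-1] = (data$value[i-1] / totaluser) * 100
  }
  else {totaluser = 0}
}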
and I also tried the sqldf package:
percentage <- sqldf("select adv_id,seg_id,value,sum(value)/sum(sum(value)) over(partition by adv_id) as 'percentage' from data")
But neither attempt gave the desired output. Is there some way to do this with the summarise function in the dplyr package?
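For what it's worth, here is a minimal sketch of the kind of thing I have in mind with dplyr. It assumes that `group_by()` followed by `mutate()` is the right tool rather than `summarise()`, since `summarise()` would collapse each adv_id to a single row while I need to keep every seg_id. The name `result` and the rounding to two decimals are my own choices.

```r
library(dplyr)

# same dataset as above
adv_id <- c("abc","abc","abc","pqr","pqr","pqr","xyz","xyz","xyz")
seg_id <- c("1","2","3","4","5","6","7","8","9")
value  <- c(120,450,108,1004,567,768,111,222,3334)
data   <- data.frame(adv_id, seg_id, value)

result <- data %>%
  group_by(adv_id) %>%                                         # one group per adv_id
  mutate(percentage = round(value / sum(value) * 100, 2)) %>%  # sum(value) is the group total here
  ungroup()

print(result)
```

Inside `mutate()` on grouped data, `sum(value)` is evaluated per group, so each row is divided by its own adv_id total. The percentage column comes out numeric; if the "%" sign is needed, something like `paste0(percentage, "%")` could be applied afterwards.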