I have a data frame "df" with 5 variables and 366 observations. Ta, Tb and Tc are the results of multiplying W by a, b and c, respectively.
library(dplyr)  # provides %>% and mutate(), used in the last line
day <- seq(as.Date("2003-04-01"), as.Date("2004-03-31"), by = "day")  # 366 dates
W <- sample(10:1500, 366, replace = TRUE)
a <- runif(366, 0.005, 2.3)
b <- runif(366, 0.5, 3.1)
c <- runif(366, 0.003, 0.04)
df <- data.frame(day, W, a, b, c)
df1 <- df %>% mutate(Ta = W * a, Tb = W * b, Tc = W * c)
I want to calculate each value's share of its column total, as a percentage, for Ta, Tb and Tc. I did it like this:
library(dplyr)
df1 <- df1 %>%
  mutate(Ta.perc = Ta / sum(Ta) * 100,
         Tb.perc = Tb / sum(Tb) * 100,
         Tc.perc = Tc / sum(Tc) * 100)
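As a side note, the three near-identical expressions above could probably be collapsed with `across()`, which is available in dplyr 1.0.0 and later. The sketch below assumes such a dplyr version; the data frame `toy` and its values are hypothetical stand-ins for df1, chosen so the result is easy to check by hand.

```r
library(dplyr)

# Hypothetical stand-in for df1 (illustrative values, not the real data)
toy <- data.frame(Ta = c(1, 3), Tb = c(2, 2), Tc = c(1, 9))

# Apply the same percentage formula to each listed column;
# .names controls how the new columns are named (Ta.perc, Tb.perc, Tc.perc)
toy_perc <- toy %>%
  mutate(across(c(Ta, Tb, Tc), ~ .x / sum(.x) * 100, .names = "{.col}.perc"))
```

Each new column should sum to 100, which is a quick sanity check for this kind of calculation.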
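However, if my data frame (df) is converted to long format (df2) as shown below, how do I calculate the percentage for each cell in the column "T_param", separately for each of the variables a, b and c?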
library(tidyr)
df2 <- gather(df, "var", "value", 3:5)
df2$T_param <- df2$W * df2$value
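For reference, `gather()` still works but has been superseded in tidyr 1.0.0 and later by `pivot_longer()`. A roughly equivalent reshape might look like the sketch below, which assumes a recent tidyr; `toy` and its values are hypothetical. One difference to be aware of: `pivot_longer()` keeps the rows of each original observation together, whereas `gather()` stacks the columns one after another, so the row order differs.

```r
library(tidyr)

# Hypothetical two-row stand-in for df (illustrative values only)
toy <- data.frame(
  day = as.Date("2003-04-01") + 0:1,
  W   = c(100, 200),
  a   = c(1, 2),
  b   = c(0.5, 3),
  c   = c(0.01, 0.02)
)

# Move columns a, b and c into a name column ("var") and a value column ("value")
long <- pivot_longer(toy, cols = all_of(c("a", "b", "c")),
                     names_to = "var", values_to = "value")
long$T_param <- long$W * long$value
```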
Answer 0 (score: 2)
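What you need is the group_by function. After grouping by var, sum(T_param) is evaluated within each group, so each value is divided by the total for its own variable:
df2 %>% group_by(var) %>% mutate(T.perc = T_param / sum(T_param) * 100)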
# Source: local data frame [1,098 x 6]
# Groups: var [3]
#           day     W    var     value   T_param     T.perc
#        (date) (int) (fctr)     (dbl)     (dbl)      (dbl)
# 1  2003-04-01  1006      a 2.2037060 2216.9283 0.66052866
# 2  2003-04-02   270      a 1.3955652  376.8026 0.11226747
# 3  2003-04-03   783      a 0.1573310  123.1902 0.03670423
# 4  2003-04-04    80      a 1.5705017  125.6401 0.03743419
# 5  2003-04-05  1224      a 0.2571567  314.7598 0.09378197
# 6  2003-04-06   813      a 0.7835079  636.9919 0.18979026
# 7  2003-04-07  1144      a 1.2742529 1457.7453 0.43433185
# 8  2003-04-08  1252      a 2.2194189 2778.7125 0.82791098
# 9  2003-04-09   503      a 0.1744863   87.7666 0.02614986
# 10 2003-04-10   323      a 1.5328218  495.1014 0.14751433
# ..        ...   ...    ...       ...       ...        ...
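One way to convince yourself that the grouped calculation is right is to check that the percentages add up to 100 within each level of var. The sketch below does that on a small hypothetical data frame (the values are made up for illustration) and uses base R's `ave()` as a dplyr-free way to get the same per-group result.

```r
# Hypothetical long-format data, three rows per variable
toy_long <- data.frame(
  var     = rep(c("a", "b", "c"), each = 3),
  T_param = c(10, 30, 60, 5, 5, 10, 1, 2, 7)
)

# ave() applies FUN within each level of var and returns a vector
# aligned with the original rows
toy_long$T.perc <- ave(toy_long$T_param, toy_long$var,
                       FUN = function(x) x / sum(x) * 100)

# Sum the percentages per variable; each total should be 100
group_totals <- tapply(toy_long$T.perc, toy_long$var, sum)
```

If any entry of `group_totals` differs from 100, the division was most likely done against the overall total rather than the per-group total.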