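Folks,
From the df below, I want to compute values normalized by skew sk0. sk0 is the center skew of a given group, and each group has exactly one sk0. Each group has one or more rows.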
group=c("g1", "g1", "g1", "g1", "g2", "g3", "g3", "g3", "g4", "g4")
skew=c("sk0", "sk1", "sk2", "sk3", "sk0", "sk2", "sk0", "sk1", "sk1", "sk0")
value=c(0.5, 0.3, 0.8, 1.0, 0.1, 0.4, 0.9, 0.7, 0.6, 0.2)
df = data.frame(group, skew, value)
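The expected result is shown below. valueNorm = value / the sk0 value of the row's group. For example, rows 1 to 4 belong to group g1. The center skew sk0 of g1 is in row 1 and its value is 0.5, so the values in rows 1 to 4 are each divided by 0.5.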
   group skew value GroupSk0 valueNorm
1     g1  sk0   0.5      0.5      1.00
2     g1  sk1   0.3      0.5      0.60
3     g1  sk2   0.8      0.5      1.60
4     g1  sk3   1.0      0.5      2.00
5     g2  sk0   0.1      0.1      1.00
6     g3  sk2   0.4      0.9      0.44
7     g3  sk0   0.9      0.9      1.00
8     g3  sk1   0.7      0.9      0.78
9     g4  sk1   0.6      0.2      3.00
10    g4  sk0   0.2      0.2      1.00
Thanks for your help!
Answer 0: (score: 0)
Base R solution:
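# Join each row to its group's sk0 row; the joined value column gets the suffix "GroupSK0"
df <- merge(df, df[df$skew == "sk0", c("group", "value")], by = "group", suffixes = c("", "GroupSK0"))
# Rename valueGroupSK0 to GroupSK0
names(df) <- gsub("valueGroupSK0", "GroupSK0", names(df))
# Normalize each value by its group's sk0 value
df$valueNorm <- df$value / df$GroupSK0
df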
##    group skew value GroupSK0 valueNorm
## 1     g1  sk0   0.5      0.5 1.0000000
## 2     g1  sk1   0.3      0.5 0.6000000
## 3     g1  sk2   0.8      0.5 1.6000000
## 4     g1  sk3   1.0      0.5 2.0000000
## 5     g2  sk0   0.1      0.1 1.0000000
## 6     g3  sk2   0.4      0.9 0.4444444
## 7     g3  sk0   0.9      0.9 1.0000000
## 8     g3  sk1   0.7      0.9 0.7777778
## 9     g4  sk1   0.6      0.2 3.0000000
## 10    g4  sk0   0.2      0.2 1.0000000
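One thing to keep in mind: by default `merge` sorts the result on the `by` column (`sort = TRUE`), so it only keeps the original row order here because the example data is already ordered by group. If row order matters, a lookup with `match` is one possible alternative. This is only a minimal sketch, assuming (as the question states) exactly one sk0 row per group; the name `sk0_rows` is mine and not from the question.

```r
# Rebuild the example data from the question
group <- c("g1", "g1", "g1", "g1", "g2", "g3", "g3", "g3", "g4", "g4")
skew  <- c("sk0", "sk1", "sk2", "sk3", "sk0", "sk2", "sk0", "sk1", "sk1", "sk0")
value <- c(0.5, 0.3, 0.8, 1.0, 0.1, 0.4, 0.9, 0.7, 0.6, 0.2)
df <- data.frame(group, skew, value)

# Rows holding each group's sk0 value (assumes exactly one per group)
sk0_rows <- df[df$skew == "sk0", ]

# Look up the sk0 value of each row's group; the rows of df are not reordered
df$GroupSk0 <- sk0_rows$value[match(df$group, sk0_rows$group)]

# Normalize each value by its group's sk0 value
df$valueNorm <- df$value / df$GroupSk0
```

Because `match` returns positions in the order of `df$group`, the new columns line up with the existing rows without any join or re-sort.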
data.table solution:
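library(data.table)
DT <- data.table(df)
# Per group, take the value of the sk0 row and assign it to every row of that group
DT[, GroupSk0 := .SD[skew == "sk0", value], by = group]
# Normalize each value by its group's sk0 value
DT[, valueNorm := value / GroupSk0]
DT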
##     group skew value GroupSk0 valueNorm
##  1:    g1  sk0   0.5      0.5 1.0000000
##  2:    g1  sk1   0.3      0.5 0.6000000
##  3:    g1  sk2   0.8      0.5 1.6000000
##  4:    g1  sk3   1.0      0.5 2.0000000
##  5:    g2  sk0   0.1      0.1 1.0000000
##  6:    g3  sk2   0.4      0.9 0.4444444
##  7:    g3  sk0   0.9      0.9 1.0000000
##  8:    g3  sk1   0.7      0.9 0.7777778
##  9:    g4  sk1   0.6      0.2 3.0000000
## 10:    g4  sk0   0.2      0.2 1.0000000
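Both solutions rely on the question's premise that every group has exactly one sk0 row. With the `merge` approach, for instance, a group with no sk0 row would silently drop out of the result (it is an inner join by default), and a group with several sk0 rows would come back with duplicated rows. If your real data might break that premise, a quick check along these lines could be run first. It is just a sketch in base R, and the name `n_sk0` is only for illustration.

```r
# Rebuild the example data from the question
group <- c("g1", "g1", "g1", "g1", "g2", "g3", "g3", "g3", "g4", "g4")
skew  <- c("sk0", "sk1", "sk2", "sk3", "sk0", "sk2", "sk0", "sk1", "sk1", "sk0")
value <- c(0.5, 0.3, 0.8, 1.0, 0.1, 0.4, 0.9, 0.7, 0.6, 0.2)
df <- data.frame(group, skew, value)

# Count the sk0 rows in each group
n_sk0 <- tapply(df$skew == "sk0", df$group, sum)

# Stop early if any group has zero or several sk0 rows
stopifnot(all(n_sk0 == 1))
```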