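I have a sample of data with values grouped by ID and Item, and I need to determine the minimum amount by which a single item would have to be increased so that the overall average for its ID reaches a threshold of 0.90.
Data: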
SampDF <- structure(list(ID = structure(c(1L, 2L, 2L), .Label = c("A1",
"A2"), class = "factor"), Item = structure(c(1L, 2L, 1L), .Label = c("Item1",
"Item2"), class = "factor"), Value.1 = c(0.7894, 0.95, 0.7894
), CurrentAvg = c(0.7894, 0.8697, 0.8697)), class = "data.frame", row.names = c(NA,
-3L))
I can get the difference for each item with the following syntax:
library(dplyr)
SampDF2 <- SampDF %>%
  group_by(ID, Item, CurrentAvg) %>%
  mutate(Value.1.Increase = 0.90 - Value.1)
Result:
structure(list(ID = structure(c(1L, 2L, 2L), .Label = c("A1",
"A2"), class = "factor"), Item = structure(c(1L, 2L, 1L), .Label = c("Item1",
"Item2"), class = "factor"), Value.1 = c(0.7894, 0.95, 0.7894
), CurrentAvg = c(0.7894, 0.8697, 0.8697), Value.1.Increase = c(0.1106,
-0.0499999999999999, 0.1106)), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -3L), vars = c("ID",
"Item", "CurrentAvg"), labels = structure(list(ID = structure(c(1L,
2L, 2L), .Label = c("A1", "A2"), class = "factor"), Item = structure(c(1L,
1L, 2L), .Label = c("Item1", "Item2"), class = "factor"), CurrentAvg = c(0.7894,
0.8697, 0.8697)), class = "data.frame", row.names = c(NA, -3L
), vars = c("ID", "Item", "CurrentAvg"), drop = TRUE), indices = list(
0L, 2L, 1L), drop = TRUE, group_sizes = c(1L, 1L, 1L), biggest_group_size = 1L)
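However, this result is not correct: it gives each item's own distance from 0.90, not the increase in item value needed to bring the ID's CurrentAvg up to the 0.90 threshold.
Is there a way to add two new columns: a ValueIncrease column showing how much the value is increased, and a NewAvg column confirming that the new average meets the 0.90 threshold?
Provided my manual calculations are correct, this would be the ideal result: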
structure(list(ID = structure(c(1L, 2L, 2L), .Label = c("A1",
"A2"), class = "factor"), Item = structure(c(1L, 2L, 1L), .Label = c("Item1",
"Item2"), class = "factor"), Value.1 = c(0.7894, 0.95, 0.8393
), CurrentAvg = c(0.7894, 0.8697, 0.8697), ValueIncrease = c(0.1106,
0.04999, 0.04999), NewAvg = c(0.9, 0.89465, 0.89465)), class = "data.frame", row.names = c(NA,
-3L))
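To make the intended calculation concrete, here is a minimal sketch of what I think the logic might look like. It rests on assumptions that are mine and not settled: the group's shortfall is taken as `0.90 * n - sum(Value.1)`, and the whole shortfall is assigned to the lowest-valued item in each ID. The names `target`, `Shortfall`, and `SampDF3` are made up for illustration.

```r
library(dplyr)

# Sample data from above (ID, Item and Value.1 only; CurrentAvg is recomputed)
SampDF <- data.frame(
  ID      = c("A1", "A2", "A2"),
  Item    = c("Item1", "Item2", "Item1"),
  Value.1 = c(0.7894, 0.95, 0.7894)
)

target <- 0.90  # assumed name for the per-ID average threshold

SampDF3 <- SampDF %>%
  group_by(ID) %>%
  mutate(
    CurrentAvg = mean(Value.1),
    # Amount the group's sum must rise for its mean to reach the target
    # (floored at 0 for groups that already meet it)
    Shortfall = max(target * n() - sum(Value.1), 0),
    # Assumption: the entire shortfall goes to the lowest-valued item
    ValueIncrease = ifelse(row_number() == which.min(Value.1), Shortfall, 0),
    # Recompute the group mean after the increase as a check
    NewAvg = mean(Value.1 + ValueIncrease)
  ) %>%
  ungroup()

print(SampDF3)
```

Under these assumptions A1 needs an increase of 0.1106, which matches my manual figure, but A2 needs 0.0606 on Item1 (2 * 0.90 - 1.7394), which differs from my hand-calculated 0.04999, so one of the two may need rechecking.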