Question

I am using R for my project and I am new to it. I have the following data:

place<-c("S1","S1","S1","S1","S2","S2","S2","S2")

product<-c("P1","P2","P3","P1","P2","P3","P1","P2")

location<-c("loc1","loc1","loc2","loc2","loc1","loc1","loc2","loc2")

profit<-c(55,80,70,90,30,40,15,20)

data<-data.frame(place,product,location,profit)

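For each place, I want to know which product gives the maximum profit at each location. The output should add a column of binary entries, where 1 marks the row with the maximum profit, like this: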

solution<-c(0,1,1,0,0,1,0,0)

I hope my question is clear. Thanks in advance.

Answer 1

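You can use ave, which applies a function within each group and returns a vector aligned with the original rows: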

transform(data, solution = ave(profit, place, location, 
                               FUN = function(x) as.integer(x == max(x))))


  place product location profit solution
1    S1      P1     loc1     55        0
2    S1      P2     loc1     80        1
3    S1      P3     loc2     70        0
4    S1      P1     loc2     90        1
5    S2      P2     loc1     30        0
6    S2      P3     loc1     40        1
7    S2      P1     loc2     15        0
8    S2      P2     loc2     20        1
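One thing to keep in mind with this approach: `x == max(x)` flags every row that ties for the group maximum. If exactly one row per group should be marked, a possible variant (a sketch, not part of the original answer) uses `which.max()`, which returns the index of the first maximum only. The names `all_max` and `first_max` are made up for illustration.

```r
# Same data as in the question
data <- data.frame(
  place    = c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2"),
  product  = c("P1", "P2", "P3", "P1", "P2", "P3", "P1", "P2"),
  location = c("loc1", "loc1", "loc2", "loc2", "loc1", "loc1", "loc2", "loc2"),
  profit   = c(55, 80, 70, 90, 30, 40, 15, 20)
)

# Flags every row whose profit equals the group maximum (ties all get 1)
all_max <- ave(data$profit, data$place, data$location,
               FUN = function(x) as.integer(x == max(x)))

# Flags only the first row that reaches the group maximum
first_max <- ave(data$profit, data$place, data$location,
                 FUN = function(x) as.integer(seq_along(x) == which.max(x)))

all_max
first_max
```

With the sample data there are no ties, so both vectors come out the same.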

Answer 2

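Are you sure this is the vector you expect for this example? With 2 places and 2 locations in each, there are 4 place/location groups, so how can "solution" contain only three 1s?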
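A quick way to check how many 1s to expect is to count the distinct place/location pairs. This is only a sanity check on the sample data; `groups` is a name chosen here for illustration.

```r
# Same data as in the question
data <- data.frame(
  place    = c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2"),
  product  = c("P1", "P2", "P3", "P1", "P2", "P3", "P1", "P2"),
  location = c("loc1", "loc1", "loc2", "loc2", "loc1", "loc1", "loc2", "loc2"),
  profit   = c(55, 80, 70, 90, 30, 40, 15, 20)
)

# One row per distinct (place, location) combination
groups <- unique(data[c("place", "location")])
nrow(groups)  # 4, so four rows should be flagged when there are no ties
```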

Here is my solution:

place<-c("S1","S1","S1","S1","S2","S2","S2","S2")

product<-c("P1","P2","P3","P1","P2","P3","P1","P2")

location<-c("loc1","loc1","loc2","loc2","loc1","loc1","loc2","loc2")

profit<-c(55,80,70,90,30,40,15,20)

data<-data.frame(place,product,location,profit)

# Maximum profit for each place at each location
df <- aggregate(data$profit, by = list(place = data$place, location = data$location), max)
# Rename the aggregated column so it matches "profit" in data
names(df)[3] <- "profit"
# Every row returned here is one that should be marked with 1 in "solution"
df$solution <- rep(1, nrow(df))

# Right outer join: all.y = TRUE keeps every row of data, including those with no match in df.
# by.x and by.y are not specified, so this is a natural join on the shared column names
# (place, location, profit); unmatched rows get NA in "solution".
data <- merge(df, data, all.y = TRUE)
# Rows of data that were not a group maximum have NA in "solution"; replace those with 0
data$solution[is.na(data$solution)] <- 0


> data
    place location profit solution product
1    S1     loc1     55        0      P1
2    S1     loc1     80        1      P2
3    S1     loc2     70        0      P3
4    S1     loc2     90        1      P1
5    S2     loc1     30        0      P2
6    S2     loc1     40        1      P3
7    S2     loc2     15        0      P1
8    S2     loc2     20        1      P2

> data$solution
[1] 0 1 0 1 0 1 0 1
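Note that merge() sorts its result on the join columns by default, so in general the rows (and therefore the solution vector) may come back in a different order than the original data. In this example the two orders happen to coincide. A minimal sketch of one way to keep the original order, assuming a helper column `row_id` (a hypothetical name, not part of the original answer):

```r
# Same data as in the question
data <- data.frame(
  place    = c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2"),
  product  = c("P1", "P2", "P3", "P1", "P2", "P3", "P1", "P2"),
  location = c("loc1", "loc1", "loc2", "loc2", "loc1", "loc1", "loc2", "loc2"),
  profit   = c(55, 80, 70, 90, 30, 40, 15, 20)
)

# Remember the original row order (row_id is a hypothetical helper column)
data$row_id <- seq_len(nrow(data))

# Maximum profit per place/location; the formula interface names the columns directly
best <- aggregate(profit ~ place + location, data = data, FUN = max)
best$solution <- 1

# Left join on the shared columns (place, location, profit), then fill the gaps with 0
merged <- merge(data, best, all.x = TRUE)
merged$solution[is.na(merged$solution)] <- 0

# Restore the original row order
merged <- merged[order(merged$row_id), ]
merged$solution
```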

Hope this helps.
