I have the following data frame, df8:
df8=data.frame(V1=c(10,20,10,20),V2=c(20,30,20,30),V3=c(20,10,20,10))
Here are the occurrence counts of the values in each row:
a<-apply(df8,MARGIN=1,table)
> a
[[1]]
10 20
1 2
[[2]]
10 20 30
1 1 1
[[3]]
10 20
1 2
[[4]]
10 20 30
1 1 1
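I have a vector, V = (0.25, 0.25, 0.5).
This means I want the occurrence counts in each row to be multiplied by the vector V, where each element of V is the weight of the corresponding column.
I would like to get something like this for the calculation (for each distinct value in a row, sum the weights of the columns that contain it):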
[[1]]
10 20
0.25 0.5
[[2]]
10 20 30
0.5 0.25 0.25
[[3]]
10 20
0.25 0.5
[[4]]
10 20 30
0.5 0.25 0.25
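One possible reading of "sum the weights of the columns for each distinct row value" can be sketched with tapply(). The helper name weight_by_value is hypothetical and not from the question. Under this reading, rows 2 and 4 reproduce the values printed above, while the value 20 in rows 1 and 3 comes out as 0.25 + 0.5 = 0.75 rather than 0.5, so the intended rule may differ from this sketch.

```r
# Column weights and data as given in the question
V <- c(0.25, 0.25, 0.5)
df8 <- data.frame(V1 = c(10, 20, 10, 20),
                  V2 = c(20, 30, 20, 30),
                  V3 = c(20, 10, 20, 10))

# Hypothetical helper: for one row, add up the weights of the columns
# that hold each distinct value.
weight_by_value <- function(x, w = V) tapply(w, x, sum)

# Build one named result per row; a list is used because rows can have
# different numbers of distinct values.
weighted <- lapply(seq_len(nrow(df8)),
                   function(i) weight_by_value(unlist(df8[i, ])))

weighted[[1]]  # weights for values 10 and 20 in row 1
weighted[[2]]  # weights for values 10, 20 and 30 in row 2
```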
Now, for each row, I want to pick the item with the highest a*V value:
> df8
V1 V2 V3 max_val
1 10 20 20 20
2 20 30 10 10
3 10 20 20 20
4 20 30 10 10
Answer 0 (score: 1)
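One option is to apply the table function to each row and find how many times the value in each column occurs in that row. The factors defined in V are then applied column-wise to find the index of the column with the largest freq*V value. The row's value at that index is the desired value.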
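To make these steps concrete, here is a small sketch that traces the first row of df8 by hand. The names counts, score and picked are illustrative only, and which.max() is used here for brevity, on the assumption that a single maximum exists.

```r
V <- c(0.25, 0.25, 0.5)   # column weights from the question
x <- c(10, 20, 20)        # first row of df8

counts <- table(x)                               # 10 occurs once, 20 occurs twice
freq_vec <- as.vector(counts[as.character(x)])   # count per position: 1 2 2
score <- freq_vec * V                            # 0.25 0.50 1.00
picked <- x[which.max(score)]                    # position 3 wins, so 20
picked
```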
#Multiplier for occurrence in each column
V = c(0.25,0.25,0.5)
#data frame
df8=data.frame(V1=c(10,20,10,20),V2=c(20,30,20,30),V3=c(20,10,20,10))
# This function receives the values of one row. It finds how often each value
# occurs in the row, multiplies those frequencies by V (column-wise), and
# returns the row value at the index where freq*V is largest.
find_max_freq_val <- function(x){
  freq_df <- as.data.frame(table(x))
  freq_vec <- mapply(function(y) freq_df[freq_df$x == y, "Freq"], x)
  # Multiply freq by V and find the index of max(freq*V),
  # then return the item at that index from x
  score <- freq_vec * V
  x[which(score == max(score))]
}
# Call the function above to add a column with the desired value
df8$new_val <- apply(df8, 1, find_max_freq_val)
df8
# V1 V2 V3 new_val
#1 10 20 20 20
#2 20 30 10 10
#3 10 20 20 20
#4 20 30 10 10
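A more compact variant of the same idea, offered only as a sketch: ave() gives the per-position counts directly, and which.max() returns only the first maximum, so every row yields exactly one value even when two columns tie on freq*V. The function name pick_weighted_mode is hypothetical, and "first column wins on a tie" is an assumption the question does not specify.

```r
V <- c(0.25, 0.25, 0.5)
df8 <- data.frame(V1 = c(10, 20, 10, 20),
                  V2 = c(20, 30, 20, 30),
                  V3 = c(20, 10, 20, 10))

# Hypothetical variant: pick the row value whose count * column weight is largest.
pick_weighted_mode <- function(x, w = V) {
  freq_vec <- ave(x, x, FUN = length)  # how often each position's value occurs in the row
  x[[which.max(freq_vec * w)]]         # which.max() keeps only the first maximum
}

# vapply() guarantees exactly one numeric result per row
result <- vapply(seq_len(nrow(df8)),
                 function(i) pick_weighted_mode(unlist(df8[i, ], use.names = FALSE)),
                 numeric(1))
result

# Tie example with made-up weights: columns 1 and 2 both score 0.5,
# so the first of them is returned.
tie_pick <- pick_weighted_mode(c(10, 20, 30), w = c(0.5, 0.5, 0.25))
```

The result can be attached with df8$new_val <- result, in the same way as the apply() call in the answer.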