I have the following data frame:
> head(PRED_BEST_SYSM_TEST2)
  V1 V2 V3 V4 V5
1  0  0  0  0  0
2  2  2  2  1  1
3  0  0  0  0  0
4  0  3  4  0  0
5  5  5  1  2  0
6  0  0  0  1  1
I want to add a column to the data frame that contains the value occurring most often in each row, like this:
  V1 V2 V3 V4 V5 max_res
1  0  0  0  0  0       0
2  2  2  2  1  1       2
3  0  0  0  0  0       0
4  0  3  4  0  0       0
5  5  5  1  2  0       5
6  0  0  1  1  1       1
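To make the goal concrete, here is a minimal sketch of the row-wise computation, assuming the columns are plain numbers. The helper name `row_mode` and the tie rule (the value seen first in the row wins) are assumptions for illustration, not part of my original code. The toy data are the six rows from the `head()` output above.

```r
# Toy data: the six rows from the head() output, entered as plain numbers.
df <- data.frame(
  V1 = c(0, 2, 0, 0, 5, 0),
  V2 = c(0, 2, 0, 3, 5, 0),
  V3 = c(0, 2, 0, 4, 1, 0),
  V4 = c(0, 1, 0, 0, 2, 1),
  V5 = c(0, 1, 0, 0, 0, 1)
)

# Hypothetical helper: most frequent value in a vector.
# On ties, the value that appears first in the vector is returned.
row_mode <- function(x) {
  u <- unique(x)                        # distinct values, in order of appearance
  counts <- tabulate(match(x, u))       # how often each distinct value occurs
  u[which.max(counts)]                  # which.max picks the first maximum
}

# One value per row, appended as a new column.
df$max_res <- apply(df, 1, row_mode)
print(df)
```

The function returns exactly one value per row, so `apply` gives back a plain vector that can be assigned as a column directly.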
I am using the following code:
g <- function(df)
{
  X <- as.data.frame(t(apply(df, 1,
    function(row)
    {
      u <- unique(row)
      n <- rowSums(outer(u, row, "=="))
      if (length(u) == 1)
      {
        c(row, u[which.max(n)], max(n), "", 0)
      }
      else
      {
        c(row, u[which.max(n)], max(n))
      }
    })))
  colnames(X) <- c(colnames(df), "max_res")
  return(X)
}
g1 <- g(PRED_BEST_SYSM_TEST2)
When I try
> head(g1)
I get very strange results, for example:
NA NA NA NA NA
NA NA NA NA NA
NA NA NA NA NA
NA NA NA NA NA NA
NA NA NA NA NA
NA NA NA NA NA
NA NA NA NA NA
NA NA NA NA NA
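One thing that may be worth checking here: the two branches in `g()` return vectors of different lengths, and `apply` is documented to return a list rather than a matrix in that case. The sketch below reproduces only that behaviour on a tiny made-up matrix. The names `m`, `ragged` and `uniform` are illustrative, and this is not a confirmed diagnosis of the output above.

```r
# Tiny made-up matrix: row 1 has a single distinct value, row 2 has several.
m <- matrix(c(0, 0, 0,
              1, 2, 2), nrow = 2, byrow = TRUE)

# Same kind of branching as in g(): one branch returns a longer vector.
ragged <- apply(m, 1, function(row) {
  if (length(unique(row)) == 1) c(row, 0, 0) else row
})

# Here every call returns a vector of the same length.
uniform <- apply(m, 1, function(row) c(row, max(row)))

is.list(ragged)     # TRUE: results of length 5 and 3 cannot be bound into a matrix
is.matrix(uniform)  # TRUE: a 4 x 2 matrix, one column per input row
```

When the result is a list, `t()` and `as.data.frame()` no longer see a rows-by-columns layout, and assigning six column names to it leaves the remaining names as `NA`.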
The structure of the PRED_BEST_SYSM_TEST2 data frame is:
> str (PRED_BEST_SYSM_TEST2)
'data.frame': 100000 obs. of 5 variables:
$ V1: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 1 2 1 2 ...
$ V2: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 2 2 1 2 ...
$ V3: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 1 2 1 1 ...
$ V4: Factor w/ 10 levels "0","1","2","3",..: 1 2 1 1 1 2 1 2 1 2 ...
$ V5: Factor w/ 10 levels "0","1","2","3",..: 1 2 1 1 1 2 2 2 1 1 ...
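Since `str` reports every column as a factor, a possible direction is to convert the labels to numbers before counting. The sketch below assumes a small stand-in data frame with the same ten levels. The helper name `most_frequent` is hypothetical, and ties here go to the smallest value because `table` sorts its entries. It has not been run on the full 100000-row data.

```r
# Small stand-in for PRED_BEST_SYSM_TEST2 (illustrative rows, factor columns).
lv <- as.character(0:9)
df <- data.frame(
  V1 = factor(c("0", "2", "5"), levels = lv),
  V2 = factor(c("0", "2", "5"), levels = lv),
  V3 = factor(c("0", "2", "1"), levels = lv),
  V4 = factor(c("0", "1", "2"), levels = lv),
  V5 = factor(c("0", "1", "0"), levels = lv)
)

# as.numeric() applied directly to a factor gives the internal codes
# (1 for level "0"), so go through the character labels instead.
m <- as.matrix(df)        # character matrix of the factor labels
mode(m) <- "numeric"      # convert the labels to numbers, keeping dimensions

# Hypothetical helper: most frequent value; ties go to the smallest value.
most_frequent <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Exactly one number per row, so apply() returns a plain vector.
df$max_res <- apply(m, 1, most_frequent)
print(df)
```

The original factor columns are left untouched; only the new `max_res` column is numeric.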