Add the most frequent value in each row of a data.frame as a new column

Time: 2017-11-10 13:11:51

Tags: r dataframe

I have the following data frame:

> head (PRED_BEST_SYSM_TEST2)
  V1 V2 V3 V4 V5
1  0  0  0  0  0
2  2  2  2  1  1
3  0  0  0  0  0
4  0  3  4  0  0
5  5  5  1  2  0
6  0  0  0  1  1

I want to add a column to the data frame that contains the value occurring most often in each row, like this:

  V1 V2 V3 V4 V5 max_res
1  0  0  0  0  0    0
2  2  2  2  1  1    2
3  0  0  0  0  0    0
4  0  3  4  0  0    0
5  5  5  1  2  0    5
6  0  0  1  1  1    1
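
The target column is the row-wise mode. A minimal sketch of that computation follows, assuming plain numeric columns (the values are copied from the head() output above). The helper name row_mode is hypothetical and does not appear in the original code.

```r
# Sample rows copied from the head() output, stored as plain numbers.
df <- data.frame(
  V1 = c(0, 2, 0, 0, 5, 0),
  V2 = c(0, 2, 0, 3, 5, 0),
  V3 = c(0, 2, 0, 4, 1, 0),
  V4 = c(0, 1, 0, 0, 2, 1),
  V5 = c(0, 1, 0, 0, 0, 1)
)

# Hypothetical helper: return the most frequent value of a vector.
# match() maps each element to its position in the unique values,
# tabulate() counts those positions, which.max() picks the largest count.
row_mode <- function(x) {
  u <- unique(x)
  u[which.max(tabulate(match(x, u)))]
}

# One value per row, so apply() returns a plain vector of length nrow(df).
df$max_res <- apply(df, 1, row_mode)
print(df)
```

When several values tie for the highest count, which.max() returns the first maximum, so this sketch picks the tied value that appears first in the row.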

I use the following code:

g <- function(df)
{
  X <- as.data.frame(t(apply( df, 1,
                              function(row)
                              {
                                u <- unique(row)
                                n <- rowSums(outer(u,row,"=="))
                                if (length(u)==1 )
                                {
                                  c(row,u[which.max(n)],max(n),"",0)
                                }
                                else
                                {
                                  c(row,u[which.max(n)],max(n))
                                }
                              })))  

  colnames(X) <- c(colnames(df),"max_res")

  return(X)
}

g1<-g(PRED_BEST_SYSM_TEST2)

When I run head(g1), I get very strange results, for example:

  NA                  NA                  NA                  NA                  NA
                        NA                  NA                  NA                  NA                  NA
                        NA                  NA                  NA                  NA                  NA
                   NA                  NA                  NA                  NA                  NA                  NA
                   NA                  NA                  NA                  NA                       NA
                   NA                  NA                  NA                  NA                       NA
                   NA                  NA                  NA                  NA                       NA
                   NA                  NA                  NA                       NA                  NA
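
One possible contributor to output like this, offered as a sketch and not as a confirmed diagnosis: the two branches of the if in g() return vectors of different lengths (row plus four extra elements when the row is constant, row plus two otherwise). When the function passed to apply() returns results of different lengths, apply() returns a list and not a matrix. The toy functions ragged and fixed below are hypothetical and only illustrate that behavior.

```r
m <- matrix(c(0, 0, 0,
              2, 2, 1), nrow = 2, byrow = TRUE)

# Hypothetical toy function: 5 values for a constant row, 4 values otherwise.
ragged <- function(row) {
  u <- unique(row)
  if (length(u) == 1) c(row, u, 0) else c(row, u[1])
}

res_ragged <- apply(m, 1, ragged)
print(is.list(res_ragged))   # the results have unequal lengths, so this is a list

# Hypothetical toy function: the same number of values for every row.
fixed <- function(row) {
  u <- unique(row)
  c(row, u[1])
}

res_fixed <- apply(m, 1, fixed)
print(is.matrix(res_fixed))  # equal lengths, so this is a matrix
print(dim(res_fixed))        # one column per input row, hence the usual t()
```

Returning the same number of elements from both branches keeps the result rectangular, so t() and as.data.frame() then yield one row per input row.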

The structure of the PRED_BEST_SYSM_TEST2 data frame is:

 > str (PRED_BEST_SYSM_TEST2)
'data.frame':   100000 obs. of  5 variables:
 $ V1: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 1 2 1 2 ...
 $ V2: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 2 2 1 2 ...
 $ V3: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 1 2 1 1 ...
 $ V4: Factor w/ 10 levels "0","1","2","3",..: 1 2 1 1 1 2 1 2 1 2 ...
 $ V5: Factor w/ 10 levels "0","1","2","3",..: 1 2 1 1 1 2 2 2 1 1 ...
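
Because every column is a factor, numeric work on these columns has to distinguish the internal level codes from the labels. The sketch below assumes the ten levels are the strings "0" to "9", which the truncated str() output suggests but does not fully show. The names f, df_f and df_n are made up for illustration.

```r
# Assumption: the levels are "0", "1", ..., "9".
f <- factor(c("0", "2", "2", "5"), levels = as.character(0:9))

codes  <- as.integer(f)                # internal level codes, not the labels
values <- as.integer(as.character(f))  # the labels converted to numbers

print(codes)
print(values)

# The same conversion applied to every column of a small factor data frame.
df_f <- data.frame(V1 = f, V2 = rev(f))
df_n <- df_f
df_n[] <- lapply(df_f, function(col) as.integer(as.character(col)))
str(df_n)
```

Going through as.character() first converts the labels and not the codes. With levels starting at "0", the codes are each one higher than the value they label.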

0 answers