Question

I have a data.frame that looks like this:

   Element1     Element2        Value           Index   
         a         cf            0.14             1           
         a         ng            0.25             1           
         a         ck            0.12             1         
         a         rt            0.59             1      
         a         pl            0.05             1          
         b         gh            0.02             2          
         b         er            0.91             2
         b         jk            0.87             2
         c         qw            0.23             3
         c         po            0.15             3

I would like the following output:

   Element_a1     Element_a2    Value_a       Element_b1   Element_b2  Value_b
         a         cf            0.14             b            gh       0.02      
         a         ng            0.25             b            er       0.91   
         a         ck            0.12             b            jk       0.87
         a         rt            0.59             NA           NA       NA
         a         pl            0.05             NA           NA       NA

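and so on...

I applied the `split` function to split the initial data.frame by the "Index" column, but I cannot convert the split data.frame (i.e. a list of data.frames) into a single data.frame, because the individual data.frames do not all have the same number of rows. I tried applying (from the plyr package)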

x = do.call(rbind.fill, spl)

from another post, but it returns a data.frame just like the initial one.

Can anyone help me?

Best,

F.

Answer 1

Here is one way:

nRow <-  max(table(dat$Element1))          # maximum number of rows in a group
spl2 <- by(dat, dat$Element1, FUN = function(x) {           
  if (nRow > nrow(x)) {                    # insufficient number of rows?
    subdat <- dat[seq_len(nRow - nrow(x)), ]  # create a data frame
    subdat[ , ] <- NA                      # fill it with NAs
    return(rbind(x, subdat))               # bind it to the subset and return the result
  }
  return(x)                                # return the subset as it is
})
result <- do.call(cbind, spl2)             # bind all subsets together
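
To try this on the data from the question, here is a self-contained sketch. It assumes the data frame is named `dat` and that the text columns are plain character vectors (`stringsAsFactors = FALSE`); neither is stated in the question. The approach is the same as above, only condensed.

```r
# Sample data rebuilt from the question (the name `dat` is an assumption)
dat <- data.frame(
  Element1 = rep(c("a", "b", "c"), times = c(5, 3, 2)),
  Element2 = c("cf", "ng", "ck", "rt", "pl", "gh", "er", "jk", "qw", "po"),
  Value    = c(0.14, 0.25, 0.12, 0.59, 0.05, 0.02, 0.91, 0.87, 0.23, 0.15),
  Index    = rep(1:3, times = c(5, 3, 2)),
  stringsAsFactors = FALSE
)

nRow <- max(table(dat$Element1))               # size of the largest group
spl2 <- by(dat, dat$Element1, FUN = function(x) {
  if (nRow > nrow(x)) {
    subdat <- dat[seq_len(nRow - nrow(x)), ]   # borrow rows for their shape only
    subdat[ , ] <- NA                          # blank them out
    return(rbind(x, subdat))                   # pad the short group
  }
  x
})
result <- do.call(cbind, spl2)

dim(result)   # one row per row of the largest group, four columns per group
```

Groups `b` and `c` are shorter than group `a`, so their trailing rows are filled with NA. The Index column is carried along for every group; drop it from `dat` first if it is not wanted in the output.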

Answer 2

I would use split and then cbind after padding. I borrowed the cbindPad function from "combining two data frames of different lengths":

cbindPad <- function(...){
  args <- list(...)
  n <- sapply(args,nrow)
  mx <- max(n)
  pad <- function(x, mx){
    if (nrow(x) < mx){
      nms <- colnames(x)
      padTemp <- matrix(NA,mx - nrow(x), ncol(x))
      colnames(padTemp) <- nms
      return(rbind(x,padTemp))
    }
    else{
      return(x)
    }
  }
  rs <- lapply(args,pad,mx)
  return(do.call(cbind,rs))
}

## assume your data is in a data.frame called dat
dat_split <- split(dat, dat$Element1)
out <- do.call( cbindPad, dat_split )
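
As a more compact variant (a different technique from cbindPad, sketched here only for comparison), padding can also rely on the fact that indexing a data frame with row numbers beyond `nrow(x)` yields all-NA rows. The sample data, the name `dat`, and the choice to drop the Index column are assumptions made to match the desired output in the question.

```r
# Sample data rebuilt from the question (the name `dat` is an assumption)
dat <- data.frame(
  Element1 = rep(c("a", "b", "c"), times = c(5, 3, 2)),
  Element2 = c("cf", "ng", "ck", "rt", "pl", "gh", "er", "jk", "qw", "po"),
  Value    = c(0.14, 0.25, 0.12, 0.59, 0.05, 0.02, 0.91, 0.87, 0.23, 0.15),
  Index    = rep(1:3, times = c(5, 3, 2)),
  stringsAsFactors = FALSE
)

# Drop Index, then split into one data frame per Element1 value
parts <- split(dat[, c("Element1", "Element2", "Value")], dat$Element1)
mx <- max(sapply(parts, nrow))                 # size of the largest group

# Row indices past nrow(x) produce NA rows, which pads each group to mx rows
padded <- lapply(parts, function(x) x[seq_len(mx), ])

out <- do.call(cbind, padded)                  # groups side by side
rownames(out) <- NULL                          # discard leftover row names
```

Because the list passed to `cbind` is named, the resulting columns are prefixed with the group name; they can be renamed with `names(out) <-` if the exact headers from the question are needed.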
