How to randomly name the columns or rows of a data frame?

Date: 2015-03-10 10:29:10

Tags: r

There are similar questions, such as Changing column names of a data frame in R, but none of them solves this problem.

In fact, I have a data frame like the one below:

M <- data.frame(matrix(rnorm(5),100,50))

I am trying to create a name for each column, as follows:

colnames(M) <- paste( LETTERS, "col", sep ="")

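This works if the number of columns is less than or equal to the number of letters. What should I do if I want to:

1- repeat the letters once the alphabet runs out

2- randomly generate a name for each column that combines a fixed word with random letters, like Ccol GFcol Mercol, for as many columns or rows as there are?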

2 Answers:

Answer 0: (score: 2)

For the second part of the question (since the first part seems to have been solved by akrun), you can try:

# Generate random combinations of at most three letters (uniqueness is not guaranteed)
LET <- apply(expand.grid(LETTERS, LETTERS, LETTERS)[sample(1:676, dim(M)[2]),], 1, function(x) x[sample(1:3, sample(1:3))])
colnames(M) <- paste0(sapply(LET, paste0, collapse = ""), "col")

which gives:

 head(M, 2)
     AZFcol     OJcol      Gcol    ALPcol     NAcol     VAcol     KEcol      Acol     VBcol     HAcol
1 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018
2  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753
      KYcol    AARcol      Wcol     EAcol    OTAcol     AMcol     AAcol     QAcol      Acol     AMcol
1 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018
2  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753
      AScol     DQcol      Bcol      Jcol     BAcol     AIcol     WEcol    SAUcol      Acol      Acol
1 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018
2  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753
     RAOcol     JAcol    GAEcol    ABQcol     BAcol     TAcol    AAMcol    ACEcol      Kcol     NAcol
1 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018
2  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753
       Bcol    HAEcol     ABcol    AVDcol      Hcol     AQcol     WHcol    KIAcol     QLcol     FRcol
1 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018 -1.842018
2  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753  1.069753
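
Note that the names in the output above are not all unique (for example, `Acol` appears several times). If unique names are needed, one possible approach is to build the full pool of one- to three-letter strings and sample from it without replacement. This is only a sketch, not part of the original answer: `n`, `pool`, and `new_names` are illustrative names, and `set.seed()` is there purely for reproducibility.

```r
set.seed(42)  # assumption: fixed seed, only to make the sketch reproducible
n <- 50       # number of columns, matching ncol(M) in the question

# All letter strings of length 1 to 3: 26 + 26^2 + 26^3 candidates
grid2 <- expand.grid(LETTERS, LETTERS, stringsAsFactors = FALSE)
grid3 <- expand.grid(LETTERS, LETTERS, LETTERS, stringsAsFactors = FALSE)
pool <- c(LETTERS,
          paste0(grid2$Var1, grid2$Var2),
          paste0(grid3$Var1, grid3$Var2, grid3$Var3))

# Sampling without replacement from distinct strings gives unique names
new_names <- paste0(sample(pool, n), "col")
```

The result can then be assigned with `colnames(M) <- new_names`, provided `n` equals `ncol(M)`.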

Answer 1: (score: 0)

akrun gave the answer to the first part:         rep(paste(LETTERS, "col", sep = ""), length.out = ncol(M))
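
To see how the recycling in that one-liner plays out, here is a small sketch on the 50-column data frame from the question. The `make.unique()` step is an assumption added here, not part of akrun's answer, for the case where repeated names are a problem; `recycled` and `unique_names` are illustrative variable names.

```r
M <- data.frame(matrix(rnorm(5), 100, 50))

# Recycle the 26 letter-based names until there is one per column
recycled <- rep(paste(LETTERS, "col", sep = ""), length.out = ncol(M))
colnames(M) <- recycled
# recycled[27] is "Acol" again, so names repeat after the 26th column

# Optional: make.unique() appends ".1", ".2", ... to repeated names
unique_names <- make.unique(recycled)
```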

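For the second part, the only difficulty I see is avoiding sampling the same letters twice, so that the column names stay unique. This is like counting in base 26, so you can first count in that base up to your number of columns: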

    GetNumberSuiteAnyBase <- function(lengthSuite,base){
        nB <- length(base) # radix of your base
        nDigits <- floor(log(lengthSuite-1)/log(nB))+1 # the number of digits you'll need
        numberSuite <- ""
        for(iDigit in 1:nDigits){
            newDigit <- rep(base,each=nB^(iDigit-1),length.out=lengthSuite)
            numberSuite <- paste0(newDigit,numberSuite)
        }
        return(numberSuite)
    }
    library("testthat")
    # as an example:
    expect_equal(as.numeric(GetNumberSuiteAnyBase(5,c(0,1))),c(0,1,10,11,100))
    # with your requirements
    colNames <- GetNumberSuiteAnyBase(ncol(M),LETTERS)
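
As an alternative sketch of the same base-26 counting idea, each digit can be read off with integer division and modulo, and the number of digits found by repeated multiplication rather than logarithms, which sidesteps any floating-point rounding concerns. The helper `base26_labels` below is hypothetical (it is not from the function above), and its name and arguments are assumptions chosen for illustration.

```r
# Hypothetical helper: fixed-width labels for the integers 0..(n - 1),
# written in the base given by `digits` (base 26 when digits = LETTERS)
base26_labels <- function(n, digits = LETTERS) {
  nB <- length(digits)
  # Smallest width such that nB^width >= n
  width <- 1
  while (nB^width < n) width <- width + 1
  idx <- seq_len(n) - 1
  labels <- character(n)  # starts as empty strings
  for (k in seq_len(width)) {
    # k-th digit from the right, prepended to the label built so far
    labels <- paste0(digits[(idx %/% nB^(k - 1)) %% nB + 1], labels)
  }
  labels
}

binary_example <- base26_labels(5, c("0", "1"))
letter_example <- base26_labels(28)
```

Because every integer from 0 to n - 1 has a distinct representation at a fixed width, the labels are unique by construction.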

Then, if you want these column names to be in random order, you can use:

    colNames <- paste0(sample(colNames),"col")