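R: list of character vectors to a data.frame

Date: 2016-03-31 13:27:27

Tags: r

I have been looking at this for quite a while now but cannot seem to solve it, although I feel it should be easy.

I have 54 factors, each containing a different number of strings (pathway names, to be precise). For example, here are two of the factors with their elements: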

> PWe1
 [1] Gene_Expression                                        
 [2] miR-targeted_genes_in_muscle_cell_-_TarBase            
 [3] Generic_Transcription_Pathway

> PWe2
  [1] miR-targeted_genes_in_epithelium_-_TarBase                           
  [2] miR-targeted_genes_in_leukocytes_-_TarBase                           
  [3] miR-targeted_genes_in_lymphocytes_-_TarBase                          
  [4] miR-targeted_genes_in_muscle_cell_-_TarBase

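What I would like to do is combine them into one large data frame with 54 columns, where each column is named after the corresponding factor. I have tried cbind, cbind.data.frame and several other options, but these return numeric values instead of strings.

Expected output: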

PWe1 PWe2
Gene_Expression miR-targeted_genes_in_epithelium_-_TarBase
miR-targeted_genes_in_muscle_cell_-_TarBase miR-targeted_genes_in_leukocytes_-_TarBase
Generic_Transcription_Pathway miR-targeted_genes_in_lymphocytes_-_TarBase
NA miR-targeted_genes_in_muscle_cell_-_TarBase

I am fairly new to R; could anyone nudge me toward a possible solution?

Thanks in advance!

3 answers:

Answer 0: (score: 2)

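lst <- mget(ls(pattern = "PW"))          #<--- Collect all matching vectors into a named list.
ind <- lengths(lst)                      #<--- Length of each vector.
as.data.frame(do.call(cbind,
  lapply(lst, function(x) `length<-`(as.character(x), max(ind)))))  #<--- Convert factors to character, pad with NA to the maximum length, convert to data.frame.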
#                                          PWe1                                        PWe2
# 1                             Gene_Expression  miR-targeted_genes_in_epithelium_-_TarBase
# 2 miR-targeted_genes_in_muscle_cell_-_TarBase  miR-targeted_genes_in_leukocytes_-_TarBase
# 3               Generic_Transcription_Pathway miR-targeted_genes_in_lymphocytes_-_TarBase
# 4                                        <NA> miR-targeted_genes_in_muscle_cell_-_TarBase
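
As a self-contained sketch of the same idea, assuming the inputs are factors as described in the question: the two vectors below are rebuilt from the question's example, and `padded`, `n` and `df` are illustrative names only. Each factor is converted to character first (since `cbind` on factors would return their integer codes), then extended with NA to the longest length.

```r
# Example data rebuilt from the question; the real workspace would hold 54 such factors.
PWe1 <- factor(c("Gene_Expression",
                 "miR-targeted_genes_in_muscle_cell_-_TarBase",
                 "Generic_Transcription_Pathway"))
PWe2 <- factor(c("miR-targeted_genes_in_epithelium_-_TarBase",
                 "miR-targeted_genes_in_leukocytes_-_TarBase",
                 "miR-targeted_genes_in_lymphocytes_-_TarBase",
                 "miR-targeted_genes_in_muscle_cell_-_TarBase"))

lst <- list(PWe1 = PWe1, PWe2 = PWe2)   # named list; the names become column names
n   <- max(lengths(lst))                # length of the longest vector

# Convert each factor to character, then extend it to length n (pads with NA).
padded <- lapply(lst, function(x) {
  x <- as.character(x)
  length(x) <- n
  x
})

# stringsAsFactors = FALSE keeps the columns as character in any R version.
df <- data.frame(padded, stringsAsFactors = FALSE)
str(df)
```

Building the data frame directly from the padded list avoids the intermediate matrix, so each column keeps its own type.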

Answer 1: (score: 1)

If you convert the factors to character before using cbind, you will not get numeric values:

    testFrame <- data.frame(cbind(as.character(PWe1), as.character(PWe3)))

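If the two vectors differ in length, cbind recycles the elements of the shorter vector, and it issues a warning when the longer length is not a multiple of the shorter one. If that is not acceptable in your case, perhaps a data.frame is not the right kind of object?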
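
To illustrate the recycling behaviour, here is a small sketch with made-up vectors `a` and `b` (hypothetical names and values, not from the question). It also shows a named list as one possible alternative container when neither recycling nor NA padding is wanted.

```r
a <- factor(c("x", "y", "z"))        # length 3
b <- factor(c("p", "q", "r", "s"))   # length 4

# cbind on factors returns their integer codes, not the labels.
# (Without suppressWarnings this warns, because 4 is not a multiple of 3.)
codes <- suppressWarnings(cbind(a, b))

# Converting to character first keeps the labels, but the shorter
# vector is still recycled: row 4 of column 1 repeats the first element.
m <- suppressWarnings(cbind(as.character(a), as.character(b)))
m[4, 1]

# A named list keeps each vector at its own length.
pathways <- list(a = as.character(a), b = as.character(b))
lengths(pathways)
```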

Answer 2: (score: 1)

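l1 <- max(length(v1), length(v2))          # length of the longer vector
length(v1) <- l1                           # pad the shorter vector with NA
length(v2) <- l1
cbind(as.character(v1), as.character(v2))  # convert to character so cbind keeps the strings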
#     [,1]                                          [,2]
#[1,] "Gene_Expression"                             "miR-targeted_genes_in_epithelium_-_TarBase"
#[2,] "miR-targeted_genes_in_muscle_cell_-_TarBase" "miR-targeted_genes_in_leukocytes_-_TarBase"
#[3,] "Generic_Transcription_Pathway"               "miR-targeted_genes_in_lymphocytes_-_TarBase"
#[4,] NA                                            "miR-targeted_genes_in_muscle_cell_-_TarBase"
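
The result of `cbind` here is a character matrix without column names. If a data.frame with named columns is needed, as in the question, one possible sketch is to name the matrix columns and convert it. The shortened vectors `v1` and `v2` below are assumptions made for illustration, and the column names are borrowed from the question.

```r
# Shortened example vectors (assumed for illustration).
v1 <- factor(c("Gene_Expression", "Generic_Transcription_Pathway"))
v2 <- factor(c("miR-targeted_genes_in_epithelium_-_TarBase",
               "miR-targeted_genes_in_leukocytes_-_TarBase",
               "miR-targeted_genes_in_lymphocytes_-_TarBase"))

l1 <- max(length(v1), length(v2))
length(v1) <- l1                      # pads the factor with NA
length(v2) <- l1

m <- cbind(as.character(v1), as.character(v2))
colnames(m) <- c("PWe1", "PWe2")      # column names taken from the question

# stringsAsFactors = FALSE keeps character columns in any R version.
df <- as.data.frame(m, stringsAsFactors = FALSE)
df
```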