In a data frame

Time: 2015-08-20 07:30:30

Tags: r

I have a data frame like this:

structure(list(P1 = c("Mark", "Katrin", "Kate", "Hank", "Tom", 
"Marcus"), P2 = c("Tim", "Greg", "Seba", "Teqa", "Justine", "Monica"
), clique = structure(list(`930` = integer(0), `2090` = integer(0), 
    `3120` = c(2L, 3L, 231L), `3663` = integer(0), `3704` = integer(0), 
    `4156` = c(19L, 27L)), .Names = c("930", "2090", "3120", 
"3663", "3704", "4156"), class = "AsIs")), .Names = c("P1", "P2", 
"clique"), row.names = c(930L, 2090L, 3120L, 3663L, 3704L, 4156L
), class = "data.frame")

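I am having trouble with my last column, which is named clique. I would like to convert this column to numeric values separated by commas, or, ideally, convert integer(0) to NA and put the numbers in separate columns, keeping a single number per column. I would accept either solution.

Sample data: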

P1  P2  clique
Mark    Tim integer(0)
Katrin  Greg    integer(0)
Kate    Seba    c(2, 3, 231)
Hank    Teqa    integer(0)
Tom Justine integer(0)
Marcus  Monica  c(19, 27)

> class(data$clique)
[1] "AsIs"

Desired output:

 P1      P2    clique
Mark     Tim      NA    
Katrin   Greg     NA    
Kate     Seba    2,3,231
Hank     Teqa     NA
Tom      Justine  NA    
Marcus   Monica  19,27

P1      P2    clique New_column1 New_column2
Mark    Tim          
Katrin  Greg          
Kate    Seba      2        3          231
Hank    Teqa          
Tom     Justine          
Marcus  Monica    19       27

1 answer:

Answer 0: (score: 2)

You can try listCol_w from my "splitstackshape" package:

library(splitstackshape)
listCol_w(mydf, "clique")[, lapply(.SD, as.numeric), by = .(P1, P2)]
##        P1      P2 clique_fl_1 clique_fl_2 clique_fl_3
## 1:   Mark     Tim          NA          NA          NA
## 2: Katrin    Greg          NA          NA          NA
## 3:   Kate    Seba           2           3         231
## 4:   Hank    Teqa          NA          NA          NA
## 5:    Tom Justine          NA          NA          NA
## 6: Marcus  Monica          19          27          NA

I recommend this because you mentioned that you want numeric values. You can't store something like "2,3,231" as a numeric value.
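If you would rather not depend on a package, the same wide layout can be sketched in base R. This is not what listCol_w does internally; it is only a minimal illustration that pads each list element with NA up to the longest length. The data frame is rebuilt from the question's sample, and the clique_1, clique_2, clique_3 column names are my own assumption. It assumes at least one element has more than one value.

```r
# Sample data from the question, rebuilt with I() so clique is a list column
mydf <- data.frame(
  P1 = c("Mark", "Katrin", "Kate", "Hank", "Tom", "Marcus"),
  P2 = c("Tim", "Greg", "Seba", "Teqa", "Justine", "Monica"),
  stringsAsFactors = FALSE
)
mydf$clique <- I(list(integer(0), integer(0), c(2L, 3L, 231L),
                      integer(0), integer(0), c(19L, 27L)))

# The longest vector decides how many new columns are needed
n_max <- max(lengths(mydf$clique))

# Pad every element to n_max values; indexing past the end yields NA,
# so integer(0) becomes a row of NAs
wide <- t(vapply(mydf$clique,
                 function(x) as.numeric(x)[seq_len(n_max)],
                 numeric(n_max)))
colnames(wide) <- paste0("clique_", seq_len(n_max))  # hypothetical names

result <- cbind(mydf[c("P1", "P2")], wide)
result
```

Every new column is numeric, so the values can be used in arithmetic directly.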

If you still want to try the approach of collapsing the values and then splitting them, you can try:

mydf$clique <- vapply(mydf$clique, function(x) paste(x, collapse = ","), character(1L))

str will show that you now have a character column instead of a list of vectors. You can then use cSplit to get the wide format.

> str(mydf)
'data.frame':   6 obs. of  3 variables:
 $ P1    : chr  "Mark" "Katrin" "Kate" "Hank" ...
 $ P2    : chr  "Tim" "Greg" "Seba" "Teqa" ...
 $ clique: chr  "" "" "2,3,231" "" ...
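
Note that the collapsed column above holds empty strings ("") where the question asked for NA. As a small variation, assuming you want NA in those positions, the collapse step can return NA_character_ for empty vectors. The sketch below works on a standalone copy of the list column rather than on mydf itself.

```r
# Standalone copy of the list column from the question
clique <- list(integer(0), integer(0), c(2L, 3L, 231L),
               integer(0), integer(0), c(19L, 27L))

# Collapse each element into one comma-separated string;
# empty vectors become NA instead of ""
collapsed <- vapply(
  clique,
  function(x) if (length(x) > 0L) paste(x, collapse = ",") else NA_character_,
  character(1L)
)
collapsed
# NA NA "2,3,231" NA NA "19,27"
```

The result is still a character vector, so it matches the first desired output but cannot be used as numeric values without splitting it again.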