I have a data frame like this:
structure(list(P1 = c("Mark", "Katrin", "Kate", "Hank", "Tom",
"Marcus"), P2 = c("Tim", "Greg", "Seba", "Teqa", "Justine", "Monica"
), clique = structure(list(`930` = integer(0), `2090` = integer(0),
`3120` = c(2L, 3L, 231L), `3663` = integer(0), `3704` = integer(0),
`4156` = c(19L, 27L)), .Names = c("930", "2090", "3120",
"3663", "3704", "4156"), class = "AsIs")), .Names = c("P1", "P2",
"clique"), row.names = c(930L, 2090L, 3120L, 3663L, 3704L, 4156L
), class = "data.frame")
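My last column is named clique, and it is giving me trouble. I want to convert this column into comma-separated values held in a single column or, preferably, convert integer(0) to NA and put the numbers in separate columns, with only one number per column.
I would accept either solution.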
Sample data:
P1 P2 clique
Mark Tim integer(0)
Katrin Greg integer(0)
Kate Seba c(2, 3, 231)
Hank Teqa integer(0)
Tom Justine integer(0)
Marcus Monica c(19, 27)
> class(data$clique)
[1] "AsIs"
Desired output:
P1 P2 clique
Mark Tim NA
Katrin Greg NA
Kate Seba 2,3,231
Hank Teqa NA
Tom Justine NA
Marcus Monica 19,27
or
P1      P2       clique  New_column1  New_column2
Mark    Tim
Katrin  Greg
Kate    Seba     2       3            231
Hank    Teqa
Tom     Justine
Marcus  Monica   19      27
Answer 0 (score: 2)
You can try listCol_w from my "splitstackshape" package:
library(splitstackshape)
listCol_w(mydf, "clique")[, lapply(.SD, as.numeric), by = .(P1, P2)]
## P1 P2 clique_fl_1 clique_fl_2 clique_fl_3
## 1: Mark Tim NA NA NA
## 2: Katrin Greg NA NA NA
## 3: Kate Seba 2 3 231
## 4: Hank Teqa NA NA NA
## 5: Tom Justine NA NA NA
## 6: Marcus Monica 19 27 NA
I recommend this because you mentioned that you want numeric values. You can't store something like "2,3,231" as a numeric value.
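For comparison, the same wide numeric layout can be sketched in base R without any package. This is only an illustration and not how listCol_w is implemented; the column names clique_1, clique_2, ... are made up here, and the data frame is rebuilt by hand so the snippet runs on its own.

```r
# Rebuild the sample data; I() keeps the list as a single "AsIs" column.
mydf <- data.frame(
  P1 = c("Mark", "Katrin", "Kate", "Hank", "Tom", "Marcus"),
  P2 = c("Tim", "Greg", "Seba", "Teqa", "Justine", "Monica"),
  stringsAsFactors = FALSE
)
mydf$clique <- I(list(integer(0), integer(0), c(2L, 3L, 231L),
                      integer(0), integer(0), c(19L, 27L)))

# Width of the widest entry decides how many new columns are needed.
n <- max(lengths(mydf$clique))

# Indexing past the end of a vector yields NA, so every entry
# (including integer(0)) is padded to length n.
wide <- do.call(rbind, lapply(mydf$clique, function(x) as.numeric(x)[seq_len(n)]))
colnames(wide) <- paste0("clique_", seq_len(n))  # hypothetical names

result <- cbind(mydf[c("P1", "P2")], wide)
print(result)
```

Each new column is numeric, and rows whose clique was integer(0) contain only NA.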
If you still want to try the approach of collapsing the values and then splitting them, you can try:
mydf$clique <- vapply(mydf$clique, function(x) paste(x, collapse = ","), character(1L))
str will show that the column is now a plain character vector rather than a list of vectors. You can then use cSplit to get the wide format.
> str(mydf)
'data.frame': 6 obs. of 3 variables:
$ P1 : chr "Mark" "Katrin" "Kate" "Hank" ...
$ P2 : chr "Tim" "Greg" "Seba" "Teqa" ...
$ clique: chr "" "" "2,3,231" "" ...
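Note that the collapsed column above holds empty strings for the integer(0) entries, whereas the first desired output in the question shows NA. A minimal variation on the same vapply call that returns NA instead might look like this; it is a sketch with the sample data rebuilt by hand, not part of the original answer.

```r
# Rebuild the sample data; I() keeps the list as a single "AsIs" column.
mydf <- data.frame(
  P1 = c("Mark", "Katrin", "Kate", "Hank", "Tom", "Marcus"),
  P2 = c("Tim", "Greg", "Seba", "Teqa", "Justine", "Monica"),
  stringsAsFactors = FALSE
)
mydf$clique <- I(list(integer(0), integer(0), c(2L, 3L, 231L),
                      integer(0), integer(0), c(19L, 27L)))

# Collapse each entry to one string; empty entries become NA instead of "".
mydf$clique <- vapply(
  mydf$clique,
  function(x) if (length(x) == 0L) NA_character_ else paste(x, collapse = ","),
  character(1L)
)

str(mydf)
```

The column is still character, so it matches the first desired layout only visually; the values are not numeric.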