As the title says, I want to create a data frame. Here is the matrix that will be used as input:
structure(c("2", "3", "8", "8", "10", "10", "11", "11", "11",
"11", "Frank", "Mark", "Greg", "Mati", "Paul",
"Cyntha", "Marcus", "Pablo", "Maggy", "Trist"
), .Dim = c(10L, 2L), .Dimnames = list(NULL, c("i", "vec_names"
)))
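I want to create columns based on the values in column i. If two rows have the same number in column i, the names found for them in the next column should be stored together in one column of the new data frame.
Of course, this means the columns will have different lengths, so the missing strings can be filled with NAs.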
Desired output:
2     3    8    10     11
Frank Mark Greg Paul   Marcus
           Mati Cyntha Pablo
                       Maggy
                       Trist
Answer 0 (score: 3)
You can use dcast from reshape2 to reshape to wide format:
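DF <- data.frame(m, stringsAsFactors = FALSE)  # m is the matrix from the question
library(reshape2)
# s numbers the rows within each group of i (1, 2, ... per group)
DF$s <- ave(DF$i, DF$i, FUN = seq_along)
res <- dcast(DF, s ~ i, value.var = "vec_names")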
s 10 11 2 3 8
1 1 Paul Marcus Frank Mark Greg
2 2 Cyntha Pablo <NA> <NA> Mati
3 3 <NA> Maggy <NA> <NA> <NA>
4 4 <NA> Trist <NA> <NA> <NA>
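For reference, the helper column s is just a running counter within each value of i. A minimal base R sketch of that step on its own, using the i values from the question; it counts over seq_along(i) so that the result is an integer vector, which is a small variation on the call above:

```r
# i values from the question's matrix
i <- c("2", "3", "8", "8", "10", "10", "11", "11", "11", "11")

# ave() applies FUN within each group of i and puts the results
# back in the original row order
s <- ave(seq_along(i), i, FUN = seq_along)
print(s)
# s is 1 1 1 2 1 2 1 2 3 4
```

dcast then uses s as the row id, which is why names that share a position within their group end up on the same row.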
Unfortunately, you end up with an unwanted column, s, and the other columns are sorted lexicographically. If you want to fix this:
res$s <- NULL
res[order(as.integer(names(res)))]
2 3 8 10 11
1 Frank Mark Greg Paul Marcus
2 <NA> <NA> Mati Cyntha Pablo
3 <NA> <NA> <NA> <NA> Maggy
4 <NA> <NA> <NA> <NA> Trist
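The column order issue comes from the column names being character strings, which are compared character by character. A small base R sketch of the difference, using the column names from the dcast output:

```r
cols <- c("10", "11", "2", "3", "8")   # column order produced by dcast

# ordering the strings themselves keeps "10" and "11" before "2"
lexicographic_order <- cols[order(cols)]

# ordering by the integer values gives the numeric order
numeric_order <- cols[order(as.integer(cols))]
print(numeric_order)
# numeric_order is "2" "3" "8" "10" "11"
```

Indexing the data frame with order(as.integer(names(res))) applies the same permutation to its columns.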
Answer 1 (score: 2)
In base R, after first converting the matrix (mymat) to a data.frame, you can try the following:
df <- as.data.frame(mymat, stringsAsFactors = FALSE)  # convert the matrix to a data.frame
sp_df <- split(df, df$i)                              # split it by "i"
nb_row <- sapply(sp_df, nrow)                         # rows per group, needed to pad with NAs
# keep only the names, pad each group with NA up to the longest one, then reorder the columns numerically
mapply(function(x, y) c(x$vec_names, rep(NA, max(nb_row) - y)),
       x = sp_df,
       y = nb_row)[, order(as.numeric(names(sp_df)))]
Edit
Thanks to @Frank, here is a simpler approach that splits only the names vector (after converting to a data.frame):
sp_nm = split(df$vec_names, df$i)
do.call(cbind, lapply(sp_nm, `length<-`, max(lengths(sp_nm))))[, order(as.numeric(names(sp_nm)))]
Both approaches give the following output:
# 2 3 8 10 11
#[1,] "Frank" "Mark" "Greg" "Paul" "Marcus"
#[2,] NA NA "Mati" "Cyntha" "Pablo"
#[3,] NA NA NA NA "Maggy"
#[4,] NA NA NA NA "Trist"
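Both versions return a character matrix, while the question asks for a data frame. A minimal sketch that wraps the split-and-pad idea in a function and returns a data.frame. The name spread_by_key is made up for this illustration, and check.names = FALSE is used so that the numeric column names are kept as they are:

```r
# Matrix from the question
mymat <- structure(c("2", "3", "8", "8", "10", "10", "11", "11", "11",
                     "11", "Frank", "Mark", "Greg", "Mati", "Paul",
                     "Cyntha", "Marcus", "Pablo", "Maggy", "Trist"),
                   .Dim = c(10L, 2L),
                   .Dimnames = list(NULL, c("i", "vec_names")))

# Hypothetical helper: one column per distinct key, padded with NA
spread_by_key <- function(values, keys) {
  groups <- split(values, keys)                        # one vector per key
  groups <- groups[order(as.numeric(names(groups)))]   # numeric column order
  n <- max(lengths(groups))                            # longest group
  padded <- lapply(groups, `length<-`, n)              # pad shorter groups with NA
  data.frame(padded, check.names = FALSE, stringsAsFactors = FALSE)
}

res <- spread_by_key(mymat[, "vec_names"], mymat[, "i"])
print(res)
```

Assigning a longer length with `length<-` is what fills the shorter groups with NA, so no explicit rep(NA, ...) is needed.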
Answer 2 (score: 0)
Try the spread function from the tidyr package. This gets close to what you want.
library(tidyr)
spread(data.frame(
structure(c("2", "3", "8", "8", "10", "10", "11", "11", "11",
"11", "Frank", "Mark", "Greg", "Mati", "Paul",
"Cyntha", "Marcus", "Pablo", "Maggy", "Trist"),
.Dim = c(10L, 2L), .Dimnames = list(NULL, c("i", "vec_names")))),
"i", "vec_names")
10 11 2 3 8
1 <NA> <NA> Frank <NA> <NA>
2 <NA> <NA> <NA> Mark <NA>
3 <NA> <NA> <NA> <NA> Greg
4 <NA> <NA> <NA> <NA> Mati
5 Paul <NA> <NA> <NA> <NA>
6 Cyntha <NA> <NA> <NA> <NA>
7 <NA> Marcus <NA> <NA> <NA>
8 <NA> Pablo <NA> <NA> <NA>
9 <NA> Maggy <NA> <NA> <NA>
10 <NA> Trist <NA> <NA> <NA>
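In this output every input row stays on its own row, so nothing is collapsed. One way to get closer to the desired shape, sketched here under the assumption that tidyr is installed, is to add the within-group counter from Answer 0 before spreading. The column name s is only illustrative:

```r
library(tidyr)

df <- data.frame(
  i = c("2", "3", "8", "8", "10", "10", "11", "11", "11", "11"),
  vec_names = c("Frank", "Mark", "Greg", "Mati", "Paul",
                "Cyntha", "Marcus", "Pablo", "Maggy", "Trist"),
  stringsAsFactors = FALSE
)

# position of each name within its group of i
df$s <- ave(seq_along(df$i), df$i, FUN = seq_along)

# one row per position, one column per value of i
wide <- spread(df, i, vec_names)

# drop the helper column and put the columns in numeric order
wide$s <- NULL
wide <- wide[order(as.integer(names(wide)))]
print(wide)
```

Note that spread() is superseded by pivot_wider() in current tidyr releases, although it is still available.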