R - Reshape a 2-column data frame into a multi-column matrix

Asked: 2018-02-21 01:14:26

Tags: r dataframe matrix reshape

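Apologies if this question is a duplicate, but I could not find an existing one.

I want to reshape a data frame of the following form (read in with read_bulk):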

"name.a", 5
"name.a", 4
"name.a", 1
"name.b", 2
"name.b", 3
"name.b", 2
"name.c", 1
"name.c", 5
"name.c", 6

into this form:

5, 4, 1
2, 3, 2
1, 5, 6

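The real data frame has thousands of numbers per name. I don't know in advance how many numbers each name has, but the count is the same for every name. Each name should become a separate row in the final form.

I tried reshape but can't seem to get it to work. Any ideas?

4 answers:

Answer 0 (score: 2)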

 unstack(dat,V2~V1)
  name.a name.b name.c
1      5      2      1
2      4      3      5
3      1      2      6
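
The unstack() result above has one column per name, while the question asks for one row per name. A minimal sketch of one way to get that orientation is to convert the result to a matrix and transpose it. It assumes a data frame with the same V1/V2 columns as dat in this answer (rebuilt here so the snippet runs on its own) and an equal number of values per name; the names wide and m are only illustrative.

```r
# Rebuild the example data so the snippet is self-contained
# (same columns as `dat` in this answer: V1 = name, V2 = value)
dat <- data.frame(
  V1 = rep(c("name.a", "name.b", "name.c"), each = 3),
  V2 = c(5L, 4L, 1L, 2L, 3L, 2L, 1L, 5L, 6L),
  stringsAsFactors = FALSE
)

# One column per name (a data frame, because all groups have equal length)
wide <- unstack(dat, V2 ~ V1)

# Transpose so that each name becomes a row of a matrix
m <- t(as.matrix(wide))
m
```

If the groups had unequal lengths, unstack() would return a list instead of a data frame, so this sketch only applies under the equal-count assumption stated in the question.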

Using another library:

library(tidyverse)
dat%>%group_by(V1)%>%mutate(id2=1:n())%>%spread(id2,V2)
# A tibble: 3 x 4
# Groups:   V1 [3]
      V1   `1`   `2`   `3`
*  <chr> <int> <int> <int>
1 name.a     5     4     1
2 name.b     2     3     2
3 name.c     1     5     6

Data:

dat=read.table(h=F,sep=",",stringsAsFactors = F,strip.white = T,text=' "name.a", 5
"name.a", 4
               "name.a", 1
               "name.b", 2
               "name.b", 3
               "name.b", 2
               "name.c", 1
               "name.c", 5
               "name.c", 6')

Answer 1 (score: 1)

If the format is always the same, how about base R? Something like this:

df <- as.data.frame(matrix(unlist(df[, 2]), ncol = 3, byrow = T));
df;
#  V1 V2 V3
#1  5  4  1
#2  2  3  2
#3  1  5  6

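Explanation: unlist(df[, 2]) converts the entries of df[, 2] into a vector, which is then reshaped into a matrix with ncol = 3 (filled row by row because of byrow = T), and finally converted to a data.frame.

Sample data: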

df <- read.table(text =
    "name.a 5
name.a 4
name.a 1
name.b 2
name.b 3
name.b 2
name.c 1
name.c 5
name.c 6")
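
The question says the number of values per name is not known in advance, so the hard-coded ncol = 3 may need to be derived from the data. The following is a minimal sketch of that idea, not part of the original answer. It assumes the same V1/V2 column layout as the sample data, equal counts per name, and rows of the same name stored contiguously; counts, n_per_name and m are illustrative names.

```r
# Sample data in the same shape as above (V1 = name, V2 = value)
df <- data.frame(
  V1 = rep(c("name.a", "name.b", "name.c"), each = 3),
  V2 = c(5, 4, 1, 2, 3, 2, 1, 5, 6),
  stringsAsFactors = FALSE
)

# Number of values for each name
counts <- as.vector(table(df$V1))

# The approach only works if every name has the same number of values
stopifnot(length(unique(counts)) == 1)
n_per_name <- counts[1]

# Fill row by row so each name ends up in its own row.
# Assumes rows belonging to one name are contiguous in df.
m <- matrix(df$V2, ncol = n_per_name, byrow = TRUE,
            dimnames = list(unique(df$V1), NULL))
m
```

Using unique(df$V1) for the row names keeps them in order of first appearance, which matches the order in which byrow = TRUE fills the rows.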

Answer 2 (score: 1)

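You can use dplyr and reshape2 to do the conversion:

library(dplyr)    # group_by(), mutate(), n(), %>%
library(reshape2) # dcast()

df <- data.frame(name=c("name.a",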
                        "name.a",
                        "name.a",
                        "name.b",
                        "name.b",
                        "name.b",
                        "name.c",
                        "name.c",
                        "name.c"),
                 num=c(5,
                         4,
                         1,
                         2,
                         3,
                         2,
                         1,
                         5,
                         6))

df <- df %>%
  group_by(name) %>%
  mutate(instance = 1:n())

dcast(df,name~instance,sum,value.var='num')

Answer 3 (score: 1)

After looking at the replies, I think I found a quicker (and simpler) way. With my data, this is what worked:

setwd("~/Documents/Random/abs") # data here
a = read_bulk(directory = ".") # read in as i did
df = unstack(a) # line i was looking for
dat = as.matrix(df) # to matrix
matplot(dat, lty = 1, type = 'l', lwd = 1, xlab = "Energy (keV)", ylab = "Counts") # plot