R stats, transposing part of a data matrix

Date: 2016-01-05 21:12:33

Tags: r transform transpose

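I'm new to StackOverflow and to R stats, so please bear with me. I have a lot of experience programming in SAS, but I'm working on learning R. I routinely transpose large data sets in SAS; in R, I have a species-by-study-site matrix like this: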

Species Status Role Site1 Site2 Site3...Site25
A_a      S      P     0     0     0       1
A_b      SO     X     1     25    0       0
B_a      S      P     0      2    1       1
B_b      S      X     0      1    0       0 ...

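I'd like to transpose this table and create two new variables named "Site" and "Count", taken from the site variable names and the count data within each site: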

Species Status Role Site Count
A_a      S      P   Site1  0
A_a      S      P   Site2  0
A_a      S      P   Site3  0
A_a      S      P   Site25 1
A_b      SO     X   Site1  1
A_b      SO     X   Site2  25
A_b      SO     X   Site3  0
A_b      SO     X   Site25 0 ...
B_b      S      X   Site25 0

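I think this goes beyond the simple t() function. I have looked at the reshape and reshape2 packages, but I'm a bit lost on how to proceed. Has anyone dealt with a situation like this who could help with the code? Thanks, JimH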

2 answers:

Answer 0: (score: 1)

You can do this with dplyr / tidyr:

install.packages(c("tidyr", "dplyr"), dependencies = TRUE)
library(dplyr)
library(tidyr)
df %>% gather(Site, Count, grep('Site', names(df))) %>% arrange(Species)
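In more recent tidyr releases, `gather()` is documented as superseded by `pivot_longer()`. A minimal sketch of the same reshape, assuming tidyr 1.0.0 or later and a small toy `df` (three site columns only, built here for illustration) standing in for the asker's data:

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the asker's table (only three of the 25 site columns).
df <- data.frame(
  Species = c("A_a", "A_b", "B_a", "B_b"),
  Status  = c("S", "SO", "S", "S"),
  Role    = c("P", "X", "P", "X"),
  Site1   = c(0L, 1L, 0L, 0L),
  Site2   = c(0L, 25L, 2L, 1L),
  Site3   = c(0L, 0L, 1L, 0L)
)

# Stack every column whose name starts with "Site" into Site/Count pairs.
long <- df %>%
  pivot_longer(
    cols      = starts_with("Site"),
    names_to  = "Site",
    values_to = "Count"
  ) %>%
  arrange(Species)

long
```

Selecting with `starts_with("Site")` avoids listing all 25 site columns by hand, provided no other column name begins with "Site".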

Answer 1: (score: 1)

Or, a bit old-school, in base R (I realize the code could be more concise; feel free to optimize):

df <- structure(list(Species = structure(1:4, .Label = c("A_a", "A_b", 
"B_a", "B_b"), class = "factor"), Status = structure(c(1L, 2L, 
1L, 1L), .Label = c("S", "SO"), class = "factor"), Role = structure(c(1L, 
2L, 1L, 2L), .Label = c("P", "X"), class = "factor"), Site1 = c(0L, 
1L, 0L, 0L), Site2 = c(0L, 25L, 2L, 1L), Site3 = c(0L, 0L, 1L, 
0L)), .Names = c("Species", "Status", "Role", "Site1", "Site2", 
"Site3"), class = "data.frame", row.names = c(NA, -4L))
df
#>   Species Status Role Site1 Site2 Site3
#> 1     A_a      S    P     0     0     0
#> 2     A_b     SO    X     1    25     0
#> 3     B_a      S    P     0     2     1
#> 4     B_b      S    X     0     1     0

 reshape(df, 
   varying = c("Site1", "Site2", "Site3"), 
   v.names = "Count",
   timevar = "Site", 
   times = c("Site1", "Site2", "Site3"), 
   new.row.names = 1:1000,
   direction = "long")
#>   Species Status Role  Site Count id
#> 1      A_a      S    P Site1     0  1
#> 2      A_b     SO    X Site1     1  2
#> 3      B_a      S    P Site1     0  3
#> 4      B_b      S    X Site1     0  4
#> 5      A_a      S    P Site2     0  1
#> 6      A_b     SO    X Site2    25  2
#> 7      B_a      S    P Site2     2  3
#> 8      B_b      S    X Site2     1  4
#> 9      A_a      S    P Site3     0  1
#> 10     A_b     SO    X Site3     0  2
#> 11     B_a      S    P Site3     1  3
#> 12     B_b      S    X Site3     0  4
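The result above is grouped by site and carries an extra `id` column, whereas the question asks for rows grouped by species. One possible follow-up, sketched here in base R only: sort on `id` and site order, then drop `id`. The helper name `site_cols` is introduced for illustration, and the data frame is rebuilt so the snippet runs on its own.

```r
# Toy data matching the example above (three site columns only).
df <- data.frame(
  Species = c("A_a", "A_b", "B_a", "B_b"),
  Status  = c("S", "SO", "S", "S"),
  Role    = c("P", "X", "P", "X"),
  Site1   = c(0L, 1L, 0L, 0L),
  Site2   = c(0L, 25L, 2L, 1L),
  Site3   = c(0L, 0L, 1L, 0L)
)

# Collect the site column names instead of typing them out.
site_cols <- grep("^Site", names(df), value = TRUE)

long <- reshape(df,
  varying   = site_cols,
  v.names   = "Count",
  timevar   = "Site",
  times     = site_cols,
  direction = "long")

# Order by original row (id), then by the sites' original column order.
long <- long[order(long$id, match(long$Site, site_cols)), ]

# Drop the helper id column and reset the row names.
long$id <- NULL
rownames(long) <- NULL
long
```

Sorting with `match(long$Site, site_cols)` rather than on the site name itself keeps Site2 ahead of Site10 and later sites, which a plain alphabetical sort would not.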