Transposing by column in R using plyr

Asked: 2014-04-04 22:12:03

Tags: r dataframe plyr transpose

This is my data.frame, called test:

    strain  variable    value       L1
1   AB1            n    582.00000   1
2   AB4            n    12.00000    1
3   CB4852         n    375.00000   1
4   CB4853         n    113.00000   1
5   CB4854         n    160.00000   1

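This is a melted data.frame in which L1 runs from 1 to 30, with 78 variables for each L1 and 96 strains... 219,552 rows in total.

What I want to do is take this data.frame (test) and create L1 (30) x variable (78) new data.frames with the following orientation:

L1_variable (this would be the name of one df)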

               strain1   strain2 .... strainN
    row.name     value     value        value
    variable x   value     value        value

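So a new df is created for each combination of L1 and variable, holding the values of that variable with one column per strain.

These data.frames will then be fed into a function.

I am thinking that I need to write a function and then use it with ddply on my df test, but I don't know how to implement it.

Any and all help is appreciated.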

1 Answer:

Answer 0: (score: 0)

There is no need to create separate data frames. You can reshape the data frame as follows:

# creating sample data (extending your sample in order to be able to illustrate the method)
df <- structure(list(strain = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), .Label = c("AB1", "AB4", "CB4852", "CB4853", "CB4854"), class = "factor"), variable = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("m", "n"), class = "factor"), value = c(582, 12, 375, 113, 160, 753, 92, 115, 163, 189, 462, 72, 305, 183, 360, 142, 132, 75, 308, 216), L1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), .Names = c("strain", "variable", "value", "L1"), class = "data.frame", row.names = c(NA, -20L))

# transforming the data with the reshape2 package
library(reshape2)
df2 <- dcast(df, L1 + variable ~ strain, value.var="value")

# creating a variable with unique identifiers
df2$L1var <- paste0(df2$L1, df2$variable)

This results in the following data frame:

df2 <- structure(list(L1 = c(1L, 1L, 2L, 2L), variable = structure(c(1L, 2L, 1L, 2L), .Label = c("m", "n"), class = "factor"), AB1 = c(753, 582, 142, 462), AB4 = c(92, 12, 132, 72), CB4852 = c(115, 375, 75, 305), CB4853 = c(163, 113, 308, 183), CB4854 = c(189, 160, 216, 360), L1var = c("1m", "1n", "2m", "2n")), .Names = c("L1", "variable", "AB1", "AB4", "CB4852", "CB4853", "CB4854", "L1var"), row.names = c(NA, -4L), class = "data.frame")
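
Since the question asks for one set of strain values per L1/variable combination, the wide table can also be queried directly by its identifier. The following is a minimal sketch under stated assumptions: it rebuilds a shortened copy of the wide table by hand (three strains only, values copied from the output above), and `get_strain_values` is a hypothetical helper name, not part of reshape2 or plyr.

```r
# Shortened copy of the wide table produced by dcast() above
# (three strains only; values copied from the output shown)
df2 <- data.frame(
  L1       = c(1L, 1L, 2L, 2L),
  variable = c("m", "n", "m", "n"),
  AB1      = c(753, 582, 142, 462),
  AB4      = c(92, 12, 132, 72),
  CB4852   = c(115, 375, 75, 305),
  L1var    = c("1m", "1n", "2m", "2n"),
  stringsAsFactors = FALSE
)

# Use the identifier as row names so rows can be looked up by name
rownames(df2) <- df2$L1var

# Strain columns are everything except the bookkeeping columns
strain_cols <- setdiff(names(df2), c("L1", "variable", "L1var"))

# Hypothetical helper: one-row data frame of strain values for one identifier
get_strain_values <- function(wide, id, cols) {
  wide[id, cols, drop = FALSE]
}

one <- get_strain_values(df2, "1n", strain_cols)
print(one)
```

The result has one row and one column per strain, which matches the orientation sketched in the question.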

If you want a separate file for each unique identifier, you can split df2 like this:

# split dataframe in list of dataframes
dfs <- split(df2, df2$L1var)

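# save each dataframe in the list to a separate file
# (dfs[[i]] extracts the data frame itself; dfs[i] would pass a one-element list
# and write.csv would prefix every column name with the list name)
invisible(lapply(seq_along(dfs), function(i) write.csv(dfs[[i]], file = paste0(names(dfs)[i], '.csv'))))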
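
If the separate in-memory data frames named L1_variable are wanted after all, they could also be built from the long data with base R alone. This is a rough sketch, not the answer's method: it uses a reduced version of the sample data (three strains), assumes each strain occurs exactly once per L1/variable combination, and the names `key`, `pieces`, `wide_list`, and `piece_means` are illustrative. The `mean` call merely stands in for whatever function the pieces will be passed to.

```r
# Reduced long-format sample: 2 values of L1 x 2 variables x 3 strains
df <- data.frame(
  strain   = rep(c("AB1", "AB4", "CB4852"), times = 4),
  variable = rep(rep(c("n", "m"), each = 3), times = 2),
  value    = c(582, 12, 375, 753, 92, 115, 462, 72, 305, 142, 132, 75),
  L1       = rep(1:2, each = 6),
  stringsAsFactors = FALSE
)

# One grouping key per L1/variable combination, e.g. "1_n"
key <- interaction(df$L1, df$variable, sep = "_", drop = TRUE)

# Split the long data into one piece per key
pieces <- split(df, key)

# Turn each piece into a one-row data frame with one column per strain
# (assumes each strain appears exactly once within a piece)
wide_list <- lapply(pieces, function(d) {
  out <- as.data.frame(t(d$value))
  names(out) <- as.character(d$strain)
  out
})

print(wide_list[["1_n"]])

# Placeholder for the downstream function: mean value across strains per piece
piece_means <- sapply(wide_list, function(w) mean(unlist(w)))
print(piece_means)
```

Keeping the pieces in a named list rather than as 30 x 78 separate objects means they can be passed to any function with `lapply` or `sapply`, and an individual piece is still reachable by name, as in `wide_list[["1_n"]]`.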