Reshaping a data frame

Time: 2012-11-20 12:15:43

Tags: r dataframe reshape

I am trying to use melt and cast to convert this data frame

 knowngene                                           Meth
 uc003fia.3 cg00000108;0.864484486796394;0.928944704280193
 uc003cha.4 cg00000108;0.864484486796394;0.928944704280193
 uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076
 uc003fhz.4 cg00000132;0.881060551674426;0.910939682196076
 uc003fia.3 cg00000109;0.881060551674426;0.910939682196076
 uc003fia.3 cg00000236;0.799251070221749;0.898656886868738

into something like this:

 knowngene                                           Meth
 uc003fia.3 cg00000108;0.864484486796394;0.928944704280193;cg00000109;0.881060551674426;0.910939682196076;cg00000236;0.799251070221749;0.898656886868738
 uc003cha.4 cg00000108;0.864484486796394;0.928944704280193
 uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076;cg00000132;0.881060551674426;0.910939682196076

But for some reason I cannot reshape the data frame. Should I perhaps convert it to a list first?

3 answers:

Answer 0: (score: 2)

Split and apply will get you close:

lapply(split(x$Meth, x$knowngene), paste, collapse="; ")

$uc003cha.4
[1] "cg00000108;0.864484486796394;0.928944704280193"

$uc003fhz.4
[1] "cg00000109;0.881060551674426;0.910939682196076; cg00000132;0.881060551674426;0.910939682196076"

$uc003fia.3
[1] "cg00000108;0.864484486796394;0.928944704280193; cg00000109;0.881060551674426;0.910939682196076; cg00000236;0.799251070221749;0.898656886868738"

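The result is a named list in which the text for each gene is concatenated the way you want (use collapse=";" if you do not want a space after each semicolon). Assign that list to a variable, here called res so that it does not overwrite the data frame x, and convert it to a data frame with names() and unname():

data.frame(knowngene=names(res), Meth=unlist(unname(res)))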

   knowngene
1 uc003cha.4
2 uc003fhz.4
3 uc003fia.3
                                                                                                                                            Meth
1                                                                                                 cg00000108;0.864484486796394;0.928944704280193
2                                                 cg00000109;0.881060551674426;0.910939682196076; cg00000132;0.881060551674426;0.910939682196076
3 cg00000108;0.864484486796394;0.928944704280193; cg00000109;0.881060551674426;0.910939682196076; cg00000236;0.799251070221749;0.898656886868738
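
Putting the two steps together, here is a minimal self-contained sketch. It assumes a data frame named x with character columns knowngene and Meth (only three of the question's rows are used, for brevity), stores the list in a variable named res (a name chosen here for illustration), and uses ";" without a trailing space so the separator matches the one in the question.

```r
# A three-row subset of the question's data, built by hand for illustration.
x <- data.frame(
  knowngene = c("uc003fia.3", "uc003cha.4", "uc003fia.3"),
  Meth = c("cg00000108;0.864484486796394;0.928944704280193",
           "cg00000108;0.864484486796394;0.928944704280193",
           "cg00000109;0.881060551674426;0.910939682196076"),
  stringsAsFactors = FALSE
)

# Split Meth by gene, then collapse each group into a single string.
res <- lapply(split(x$Meth, x$knowngene), paste, collapse = ";")

# Rebuild a two-column data frame from the named list.
out <- data.frame(
  knowngene = names(res),
  Meth = unlist(res, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(out)
```

Note that split() orders the groups by the sorted gene names, not by the order in which they first appear in x.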

Answer 1: (score: 1)

Try

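# cast() comes from the reshape package; `function` is a reserved word in R,
# so the aggregating function is passed as fun.aggregate, and collapse (not sep)
# is what joins the values of one group into a single string.
library(reshape)
cast(your.data.frame, knowngene ~ ., value = "Meth",
    fun.aggregate = paste, collapse = ";")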

Answer 2: (score: 1)

It sounds like you just need aggregate().

First, your data:

myDF <- read.table(header = TRUE, text = "knowngene   Meth
uc003fia.3 cg00000108;0.864484486796394;0.928944704280193
uc003cha.4 cg00000108;0.864484486796394;0.928944704280193
uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076
uc003fhz.4 cg00000132;0.881060551674426;0.910939682196076
uc003fia.3 cg00000109;0.881060551674426;0.910939682196076
uc003fia.3 cg00000236;0.799251070221749;0.898656886868738")

Second, aggregate:

aggregate(Meth ~ knowngene, myDF, paste, collapse=";")
#    knowngene                                                                                                                                         Meth
# 1 uc003cha.4                                                                                               cg00000108;0.864484486796394;0.928944704280193
# 2 uc003fhz.4                                                cg00000109;0.881060551674426;0.910939682196076;cg00000132;0.881060551674426;0.910939682196076
# 3 uc003fia.3 cg00000108;0.864484486796394;0.928944704280193;cg00000109;0.881060551674426;0.910939682196076;cg00000236;0.799251070221749;0.898656886868738
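
The output above is sorted alphabetically by knowngene, whereas the desired result in the question lists the genes in order of first appearance. If that order matters, one possible sketch is to turn knowngene into a factor whose levels follow first appearance before aggregating. The variable name res and the use of factor() here are additions for illustration, not part of the original answer.

```r
myDF <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "knowngene   Meth
uc003fia.3 cg00000108;0.864484486796394;0.928944704280193
uc003cha.4 cg00000108;0.864484486796394;0.928944704280193
uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076
uc003fhz.4 cg00000132;0.881060551674426;0.910939682196076
uc003fia.3 cg00000109;0.881060551674426;0.910939682196076
uc003fia.3 cg00000236;0.799251070221749;0.898656886868738")

# Levels in order of first appearance, so aggregate() groups in that order
# instead of alphabetically.
myDF$knowngene <- factor(myDF$knowngene, levels = unique(myDF$knowngene))

res <- aggregate(Meth ~ knowngene, data = myDF,
                 FUN = function(v) paste(v, collapse = ";"))

# Convert the grouping column back to plain character strings.
res$knowngene <- as.character(res$knowngene)
print(res$knowngene)
```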