Tidy data Melt and Cast

Time: 2015-07-08 15:58:13

Tags: r reshape reshape2 melt

In Wickham's Tidy Data PDF, he has an example that goes from messy to tidy data.

I wonder where the code is?

For example, what code is used to go from

Table 1: Typical presentation dataset.

to

Table 3: The same data as in Table 1 but with variables in columns and observations in rows.

Perhaps melt or cast, but from http://www.statmethods.net/management/reshape.html I can't see how.

(Note to self: Need it for GDPpercapita...)

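1 answer:

Answer 0 (score: 2)

The answer depends on how your data are structured. In the paper you linked to, Hadley was writing about the "reshape" and "reshape2" packages.

It is ambiguous what the data structure in "Table 1" is. Judging by the description, it sounds like a matrix with named dimnames (like I show in mymat). In that case, a simple melt would work: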

library(reshape2)
melt(mymat)
#           Var1       Var2 value
# 1   John Smith treatmenta     —
# 2     Jane Doe treatmenta    16
# 3 Mary Johnson treatmenta     3
# 4   John Smith treatmentb     2
# 5     Jane Doe treatmentb    11
# 6 Mary Johnson treatmentb     1

If it is not a matrix but a data.frame with row.names, you can still use the matrix method with something like melt(as.matrix(mymat)).
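As a rough sketch of the row-name case, assuming a hypothetical data.frame called mydf_rn (not part of the original answer) in which NA stands in for the missing measurement:

```r
library(reshape2)

# Hypothetical data.frame: the person names are row names, not a column.
# NA stands in for the missing measurement (an assumption for this sketch).
mydf_rn <- data.frame(
  treatmenta = c(NA, 16, 3),
  treatmentb = c(2, 11, 1),
  row.names = c("John Smith", "Jane Doe", "Mary Johnson")
)

# as.matrix() keeps the row and column names as dimnames,
# so melt() uses its matrix method and returns one row per cell.
long_rn <- melt(as.matrix(mydf_rn))
long_rn
```

The result should have the same Var1 / Var2 / value layout as melt(mymat), since the row names become the first column and the column names the second.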

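On the other hand, if the "names" are a column in a data.frame (as they are in the "tidyr" vignette), you need to specify either id.vars or measure.vars so that melt knows how to treat the columns.

melt(mydf, id.vars = "name")
#           name   variable value
# 1   John Smith treatmenta     —
# 2     Jane Doe treatmenta    16
# 3 Mary Johnson treatmenta     3
# 4   John Smith treatmentb     2
# 5     Jane Doe treatmentb    11
# 6 Mary Johnson treatmentb     1

The new kid on the block is "tidyr". The "tidyr" package works with data.frames because it is often used together with dplyr. I won't reproduce the code for "tidyr" here, because the vignette already covers it sufficiently.

Sample data: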
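The objects mymat and mydf are not defined in the text above. A minimal reconstruction, inferred only from the printed melt output and therefore an assumption rather than the answerer's original code, might look like this:

```r
# Assumed reconstruction: a character matrix, because the "—" entry
# prevents the values from being stored as numbers.
people <- c("John Smith", "Jane Doe", "Mary Johnson")

mymat <- matrix(
  c("—", "16", "3",   # treatmenta (matrices fill column by column)
    "2", "11", "1"),  # treatmentb
  nrow = 3,
  dimnames = list(people, c("treatmenta", "treatmentb"))
)

# Assumed reconstruction: the same values with the names held in a column.
mydf <- data.frame(
  name = people,
  treatmenta = mymat[, "treatmenta"],
  treatmentb = mymat[, "treatmentb"],
  stringsAsFactors = FALSE
)
rownames(mydf) <- NULL  # drop the row names inherited from the matrix
```

With these definitions, melt(mymat) and melt(mydf, id.vars = "name") can be run as written in the answer.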