Question

In Wickham's Tidy Data PDF, he has an example of going from messy to tidy data.

I wonder where the code is?

For example, what code is used to go from

Table 1: Typical presentation dataset.

to

Table 3: The same data as in Table 1 but with variables in columns and observations in rows.

Perhaps melt or cast. But from http://www.statmethods.net/management/reshape.html I can't see how.

(Note to self: Need it for GDPpercapita...)

Answer 1

The answer depends on the structure of your data. In the paper you linked to, Hadley was writing about the "reshape" and "reshape2" packages.

The data structure in "Table 1" is ambiguous. Judging by the description, it sounds like a matrix with named dimnames (like the one I show in mymat). In that case, a simple melt would work:

library(reshape2)
melt(mymat)
#           Var1       Var2 value
# 1   John Smith treatmenta     —
# 2     Jane Doe treatmenta    16
# 3 Mary Johnson treatmenta     3
# 4   John Smith treatmentb     2
# 5     Jane Doe treatmentb    11
# 6 Mary Johnson treatmentb     1

If it were not a matrix but a data.frame with row.names, you could still use the matrix method with something like melt(as.matrix(mymat)).
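A minimal sketch of that row.names case, assuming a hypothetical data.frame called mydf_rn (it is not part of the original answer) that holds the values from the output above, with NA standing in for the missing entry:

```r
library(reshape2)

# Hypothetical data.frame (not in the original answer): the person names
# are row.names rather than a column. NA replaces the missing value.
mydf_rn <- data.frame(
  treatmenta = c(NA, 16, 3),
  treatmentb = c(2, 11, 1),
  row.names = c("John Smith", "Jane Doe", "Mary Johnson")
)

# as.matrix() keeps the row and column names as dimnames, so melt()
# uses its matrix method and returns one row per cell.
long_rn <- melt(as.matrix(mydf_rn))
long_rn
```

The result should have the same Var1, Var2 and value columns as the matrix example, since the data.frame is converted to a matrix before melting.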

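If, on the other hand, the "names" are a column in a data.frame (as they are in the "tidyr" vignette), you need to specify either id.vars or measure.vars so that melt knows how to treat the columns:

melt(mydf, id.vars = "name")
#           name   variable value
# 1   John Smith treatmenta     —
# 2     Jane Doe treatmenta    16
# 3 Mary Johnson treatmenta     3
# 4   John Smith treatmentb     2
# 5     Jane Doe treatmentb    11
# 6 Mary Johnson treatmentb     1

The new kid on the block is "tidyr". The "tidyr" package works with data.frames because it is often used in conjunction with dplyr. I won't reproduce the "tidyr" code here, because the vignette covers it sufficiently.

Sample data: mymat is a matrix with named dimnames, and mydf is a data.frame with the names in a "name" column, as the outputs above show.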
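The definitions of mymat and mydf are not shown here, so the following is only a reconstruction inferred from the printed outputs. It assumes the missing John Smith / treatmenta value is stored as NA rather than as a dash, which keeps both objects numeric:

```r
# Reconstructed sample data (an assumption based on the outputs above,
# not the answer's original definitions).

# Matrix with named dimnames: rows are people, columns are treatments.
# Values are filled column by column.
mymat <- matrix(
  c(NA, 16, 3, 2, 11, 1),
  nrow = 3,
  dimnames = list(
    c("John Smith", "Jane Doe", "Mary Johnson"),
    c("treatmenta", "treatmentb")
  )
)

# The same values as a data.frame with the names in an ordinary column
mydf <- data.frame(
  name = c("John Smith", "Jane Doe", "Mary Johnson"),
  treatmenta = c(NA, 16, 3),
  treatmentb = c(2, 11, 1),
  stringsAsFactors = FALSE
)
```

With these objects defined, melt(mymat) and melt(mydf, id.vars = "name") from reshape2 can be run as written. If tidyr 1.0.0 or later is installed, a roughly equivalent call for mydf would be tidyr::pivot_longer(mydf, cols = -name, names_to = "variable", values_to = "value"), though the row order may differ from melt.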

Tidy data Melt and Cast
