Question

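I have a dataset in R that contains two columns of numeric data and one column with an identifier. Some rows share the same identifier (i.e. they are the same individual) but contain different data. I want to use the identifier to move the rows that share an identifier from rows into columns. There are currently 600 rows, but there should be 400.

Could anyone share code that would do this? I am new to R and have tried the reshape (cast) functions, but I can't really follow them, and I'm not sure they do exactly what I want.

Any help is appreciated.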

Update:

Current

Expected output

ID Age Sex Age2 Sex2 Age3 Sex3 Age4 Sex4   
1   3   1   5    1     6    1    7    1
2   1   2   12   2     5    2
3   3   1

Update 2:

So far I have tried the melt and dcast commands from reshape2. I am getting there, but the result still doesn't look quite right. Here is my code:

x <- melt(example, id.vars = "ID")

x$time <- ave(x$ID, x$ID, FUN = seq_along)

example2 <- dcast (x, ID ~ time, value.var = "value")

Here is the output from that code:

ID  A   B   C    D     E    F    G    H (for clarity I have labelled these)
1   3   5   6    7     1    1    1    1
2   1   12  5    2     2    2
3   3   1

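So, as you can probably see, it mixes the 'Sex' and 'Age' variables and combines them in the same columns. For example, column D has the value '7' for person 1 (Age4) but '2' for person 2 (Sex). I can see that my code doesn't specify where the values should go, but I don't know how to write that part. Any ideas?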

Answer 1

Here is an approach using gather, spread and unite from the tidyr package:

suppressPackageStartupMessages(library(tidyverse))

x <- tribble(
  ~ID, ~Age, ~Sex,
  1, 3, 1,
  1, 5, 1,
  1, 6, 1,
  1, 7, 1,
  2, 1, 2,
  2, 12, 2,
  2, 5, 2,
  3, 3, 1
)

x %>%
  group_by(ID) %>%
  mutate(grp = 1:n()) %>%
  gather(var, val, -ID, -grp) %>%
  unite("var_grp", var, grp, sep = '') %>%
  spread(var_grp, val, fill = '')
#> # A tibble: 3 x 9
#> # Groups:   ID [3]
#>      ID Age1  Age2  Age3  Age4  Sex1  Sex2  Sex3  Sex4
#> * <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1     1 3     5     6     7     1     1     1     1
#> 2     2 1     12    5           2     2     2
#> 3     3 3                       1

If you want to keep the columns numeric, just remove the fill = '' argument from spread.
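As an illustration of the numeric variant, here is a minimal sketch of the same gather/unite/spread pipeline with the fill argument dropped, so missing cells become NA and the value columns stay numeric. It assumes dplyr and tidyr are installed; the sample data is rebuilt with base data.frame(), and the name wide is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt with base R
x <- data.frame(
  ID  = c(1, 1, 1, 1, 2, 2, 2, 3),
  Age = c(3, 5, 6, 7, 1, 12, 5, 3),
  Sex = c(1, 1, 1, 1, 2, 2, 2, 1)
)

wide <- x %>%
  group_by(ID) %>%
  mutate(grp = row_number()) %>%            # running index of each row within its ID
  ungroup() %>%
  gather(var, val, -ID, -grp) %>%           # long format: one row per ID/grp/variable
  unite("var_grp", var, grp, sep = "") %>%  # build column names such as "Age1", "Sex2"
  spread(var_grp, val)                      # no fill: gaps are NA, columns stay numeric

wide
```

With no fill value supplied, individuals with fewer rows simply get NA in the trailing columns instead of empty strings.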

Other questions that may help with this include:

R spreading multiple columns with tidyr

How can I spread repeated measures of multiple variables into wide format?

Answer 2

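I recently ran into a similar problem with my data, and since gather and spread have been retired, I wanted to provide an update using the tidyr 1.0 functions. Currently, the new pivot_longer and pivot_wider are much slower than gather and spread, especially on very large datasets, but this has reportedly been fixed in the next update of tidyr, so hopefully this updated solution is useful to people.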
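The code for this answer is not reproduced here. As a rough sketch of what a pivot_wider-based approach could look like (not necessarily what the answerer wrote), assuming tidyr 1.0 or later and dplyr, with the question's sample data rebuilt via data.frame() and the illustrative names grp and wide:

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt with base R
x <- data.frame(
  ID  = c(1, 1, 1, 1, 2, 2, 2, 3),
  Age = c(3, 5, 6, 7, 1, 12, 5, 3),
  Sex = c(1, 1, 1, 1, 2, 2, 2, 1)
)

wide <- x %>%
  group_by(ID) %>%
  mutate(grp = row_number()) %>%   # running index of each row within its ID
  ungroup() %>%
  pivot_wider(
    names_from  = grp,             # the index becomes the column-name suffix
    values_from = c(Age, Sex),     # widen both measured variables at once
    names_sep   = ""               # "Age1" rather than "Age_1"
  )

wide
```

Because pivot_wider accepts several values_from columns, no intermediate long format or unite step is needed. In tidyr 1.2.0 and later, names_vary = "slowest" should interleave the output as Age1, Sex1, Age2, Sex2, which is closer to the layout requested in the question.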

Move rows to columns in R using an identifier

2 answers: