melt and dcast with string concatenation

Time: 2015-09-15 22:09:40

Tags: r reshape reshape2 melt

Suppose I have the following data.frame:

foo <- data.frame(
  CONTACT_DATE = c(rep(as.Date("2015-09-15"), 3), rep(as.Date("2015-09-16"), 3)),
  ISSUE        = c("abc", "def", "xyz", "abc", "xyz", "def"),
  ISSUE_COUNT  = c(1000, 750, 100, 1500, 200, 100),
  RANK         = c(1, 2, 3, 1, 2, 3)
)
> foo
  CONTACT_DATE ISSUE ISSUE_COUNT RANK
1   2015-09-15   abc        1000    1
2   2015-09-15   def         750    2
3   2015-09-15   xyz         100    3
4   2015-09-16   abc        1500    1
5   2015-09-16   xyz         200    2
6   2015-09-16   def         100    3     

How do I get from the above to this:

CONTACT_DATE ISSUE_RANK_1 ISSUE_RANK_2 ISSUE_RANK_3
2015-09-15   abc (1000)   def (750)    xyz (100)
2015-09-16   abc (1500)   xyz (200)    def (100)

I believe I have to use melt and dcast from reshape2, but I haven't figured it out yet.

2 answers:

Answer 0: (score: 5)

You can use dplyr and tidyr:

library(dplyr)
library(tidyr)

foo %>%
  mutate(ISSUE_COUNT = paste0("(", ISSUE_COUNT, ")"),
         RANK = paste0("ISSUE_RANK_", RANK)) %>%
  unite(VAR, ISSUE, ISSUE_COUNT, sep = " ") %>%
  spread(RANK, VAR)

This gives:

#  CONTACT_DATE ISSUE_RANK_1 ISSUE_RANK_2 ISSUE_RANK_3
#1   2015-09-15   abc (1000)    def (750)    xyz (100)
#2   2015-09-16   abc (1500)    xyz (200)    def (100)
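
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same reshape with that function, assuming a recent tidyr is installed; the intermediate column name VAR is carried over from the answer above and wide is just an illustrative name:

```r
library(dplyr)
library(tidyr)

foo <- data.frame(
  CONTACT_DATE = as.Date(rep(c("2015-09-15", "2015-09-16"), each = 3)),
  ISSUE        = c("abc", "def", "xyz", "abc", "xyz", "def"),
  ISSUE_COUNT  = c(1000, 750, 100, 1500, 200, 100),
  RANK         = c(1, 2, 3, 1, 2, 3)
)

wide <- foo %>%
  # Build the cell label, e.g. "abc (1000)", in a single step
  mutate(VAR = paste0(ISSUE, " (", ISSUE_COUNT, ")")) %>%
  # Keep only the id column, the column that names the new columns, and the label
  select(CONTACT_DATE, RANK, VAR) %>%
  # One output column per RANK, prefixed with "ISSUE_RANK_"
  pivot_wider(names_from = RANK, values_from = VAR,
              names_prefix = "ISSUE_RANK_")

wide
```

The names_prefix argument replaces the separate mutate() on RANK, so the rank column can stay numeric until the reshape.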

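Answer 1: (score: 3)

By default (when value.var is not given), dcast uses the last column of the input table as the values in the output table.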

library(reshape2)
d = read.table(text="id CONTACT_DATE ISSUE ISSUE_COUNT RANK
1   2015-09-15   abc        1000    1
2   2015-09-15   def         750    2
3   2015-09-15   xyz         100    3
4   2015-09-16   abc        1500    1
5   2015-09-16   xyz         200    2
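6   2015-09-16   def         100    3", header=TRUE)
# Build the cell label, e.g. "abc (1000)"; it becomes the last column of d
d$x = paste(d$ISSUE, paste0("(",d$ISSUE_COUNT,")"))
# x is the last column, so dcast would pick it by default; value.var makes that explicit
dcast(CONTACT_DATE ~ RANK, data=d, value.var="x")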

Output:

  CONTACT_DATE          1         2         3
1   2015-09-15 abc (1000) def (750) xyz (100)
2   2015-09-16 abc (1500) xyz (200) def (100)
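
The output above has columns named 1, 2 and 3 rather than the ISSUE_RANK_ names asked for in the question. One possible way to get those names, sketched here on the original foo data.frame, is to build the column label before casting. LABEL and RANK_COL are hypothetical helper columns introduced only for this illustration:

```r
library(reshape2)

foo <- data.frame(
  CONTACT_DATE = as.Date(rep(c("2015-09-15", "2015-09-16"), each = 3)),
  ISSUE        = c("abc", "def", "xyz", "abc", "xyz", "def"),
  ISSUE_COUNT  = c(1000, 750, 100, 1500, 200, 100),
  RANK         = c(1, 2, 3, 1, 2, 3)
)

# Cell contents, e.g. "abc (1000)"
foo$LABEL <- paste0(foo$ISSUE, " (", foo$ISSUE_COUNT, ")")
# Output column names, e.g. "ISSUE_RANK_1"
foo$RANK_COL <- paste0("ISSUE_RANK_", foo$RANK)

# Name the value column explicitly instead of relying on column order
wide <- dcast(foo, CONTACT_DATE ~ RANK_COL, value.var = "LABEL")
wide
```

Because RANK_COL is a character column, the new columns are ordered alphabetically. That matches rank order here, but with ten or more ranks ISSUE_RANK_10 would sort before ISSUE_RANK_2, so zero-padding the rank or using a factor with explicit levels may be needed.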