Question

I have a data frame of factors named questions:

q1 q2 q3
A  A  B
C  A  A
A  B  C

I would like to reshape it into:

question answer freq
1        A      2
1        B      0
1        C      1
2        A      2
2        B      1
2        C      0
3        A      1
3        B      1
3        C      1

I feel there should be a way to do this with reshape2 or plyr, but I couldn't figure it out.

Instead, I did the following:

tbl <- data.frame()
for(i in 1:dim(questions)[2]){
    subtable <- cbind(question = rep(i, 3),
                      as.data.frame(table(questions[i])))
    tbl <- rbind(tbl, subtable)
}

Is there a more concise way to reshape this table?

Answer 1

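Here is a base R approach that is similar in concept to the one posted by @akrun. I haven't bothered with the cleanup, since that is mostly cosmetic and not related to the concept of the question.

The general approach is: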

data.frame(table(stack(mydf)))

However, stack doesn't work with factor columns, so if your data are factor rather than character, you have to apply as.character first, like this:

data.frame(table(stack(lapply(mydf, as.character))))
#   values ind Freq
# 1      A  q1    2
# 2      B  q1    0
# 3      C  q1    1
# 4      A  q2    2
# 5      B  q2    1
# 6      C  q2    0
# 7      A  q3    1
# 8      B  q3    1
# 9      C  q3    1
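For completeness, the cosmetic cleanup skipped above could look roughly like this. It is only a sketch: it rebuilds the sample data from the question as factors, and the column order and the stripping of the "q" prefix are my assumptions about what the asker wants.

```r
# Sample data from the question, stored as factors
questions <- data.frame(
  q1 = c("A", "C", "A"),
  q2 = c("A", "A", "B"),
  q3 = c("B", "A", "C"),
  stringsAsFactors = TRUE
)

# stack() needs character (not factor) columns, hence the conversion
long <- stack(lapply(questions, as.character))

# Cross-tabulate answers by question; combinations with zero counts are kept
res <- as.data.frame(table(long), stringsAsFactors = FALSE)

# Cosmetic cleanup: rename, strip the "q" prefix, reorder the columns
names(res) <- c("answer", "question", "freq")
res$question <- as.integer(sub("\\D+", "", res$question))
res <- res[, c("question", "answer", "freq")]
res
```

Because table() works on the cross-classification of both columns, answers that never occur for a question show up with a frequency of 0 rather than being dropped.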

Moving away from "plyr" and "reshape2" and towards "dplyr" and "tidyr", you can try:

library(dplyr)
library(tidyr)

mydf %>% 
  gather(question, answer, everything()) %>%  ## Get the data into a long form
  group_by(question, answer) %>%              ## Group by both question and answer columns
  summarise(freq = n()) %>%                   ## Calculate the relevant frequency
  right_join(expand(., question, answer))     ## Merge with all combinations of Qs and As
# Joining by: c("question", "answer")
# Source: local data frame [9 x 3]
# Groups: question
# 
#   question answer freq
# 1       q1      A    2
# 2       q1      B   NA
# 3       q1      C    1
# 4       q2      A    2
# 5       q2      B    1
# 6       q2      C   NA
# 7       q3      A    1
# 8       q3      B    1
# 9       q3      C    1
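Note that the output above has NA where the question asks for 0. One possible way to get zeros, assuming reasonably recent versions of the packages (tidyr 1.0.0 or later for pivot_longer, dplyr 0.8 or later for the name argument of count), is to let complete() fill the missing combinations. This is a sketch, not the approach the code above uses:

```r
library(dplyr)
library(tidyr)

# Sample data from the question, as character columns
mydf <- data.frame(
  q1 = c("A", "C", "A"),
  q2 = c("A", "A", "B"),
  q3 = c("B", "A", "C"),
  stringsAsFactors = FALSE
)

res <- mydf %>%
  # Long form: one row per (question, answer) observation
  pivot_longer(everything(), names_to = "question", values_to = "answer") %>%
  # Frequency of each observed combination
  count(question, answer, name = "freq") %>%
  # Add the unobserved combinations and set their frequency to 0
  complete(question, answer, fill = list(freq = 0L))
res
```

Here complete() plays the role of the right_join(expand(...)) step, and its fill argument replaces the resulting NA values in one go.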

Answer 2

Try

library(qdapTools)
library(reshape2)
colnames(questions) <- sub('\\D+', '', colnames(questions))
setNames(melt(as.matrix(mtabulate(questions))), 
                      c('question', 'answer', 'freq'))

Or using data.table

library(data.table)#v.1.9.5+
setkey(
    setnames(
      melt(setDT(questions, keep.rownames=TRUE), id.var='rn',
             value.name='answer')[, list(freq=.N),
                  by=list(variable, answer)],
           'variable', 'question'), 
                  question, answer)[
       CJ(question=unique(question), answer=unique(answer))][
                 is.na(freq), freq:=0][]
 #   question answer freq
 #1:        1      A    2
 #2:        1      B    0
 #3:        1      C    1
 #4:        2      A    2
 #5:        2      B    1
 #6:        2      C    0
 #7:        3      A    1
 #8:        3      B    1
 #9:        3      C    1

Answer 3

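Yes, it's a little tricky because of the zeros. After melting, don't cast directly into the shape you need; cast to wide form first and then melt again. Using base R and table might be just as easy.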

library(reshape2)
d <- read.table(text="q1 q2 q3
                       A  A  B
                       C  A  A
                       A  B  C", header=TRUE, as.is=TRUE)
melt(dcast(melt(d, measure.vars=1:3), value ~ variable))

## Aggregation function missing: defaulting to length
## Using value as id variables
##   value variable value
## 1     A       q1     2
## 2     B       q1     0
## 3     C       q1     1
## 4     A       q2     2
## 5     B       q2     1
## 6     C       q2     0
## 7     A       q3     1
## 8     B       q3     1
## 9     C       q3     1
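The one-liner above relies on defaults, which is why it prints two messages and ends up with two columns both called value. A more explicit version of the same melt, cast wide, melt again idea might look like the following sketch; the intermediate names long and wide and the output column names are my own choices.

```r
library(reshape2)

d <- read.table(text = "q1 q2 q3
A A B
C A A
A B C", header = TRUE, as.is = TRUE)

# Step 1: long form, one row per (question, answer) observation
long <- melt(d, measure.vars = 1:3,
             variable.name = "question", value.name = "answer")

# Step 2: wide form with counts; empty cells get length(<empty>) = 0
wide <- dcast(long, answer ~ question,
              fun.aggregate = length, value.var = "answer")

# Step 3: back to long form, now including the zero counts
res <- melt(wide, id.vars = "answer",
            variable.name = "question", value.name = "freq")
res
```

Naming fun.aggregate and id.vars explicitly avoids the "Aggregation function missing" and "Using value as id variables" messages, and value.name keeps the column names distinct.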

Reshaping a table with R - a better way?
