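Convert a list of matrices to a frequency table

Time: 2015-12-06 05:20:19

Tags: r list matrix

I have a list of matrices. Each matrix holds the result of one survey. The results look like this (only one answer can be selected per question):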

[[1]]
   Q1 Q2 Q3 Q4
A1  1  1  1  1
A2  0  0  0  0
A3  0  0  0  0

[[2]]
   Q1 Q2 Q3 Q4
A1  0  1  1  0
A2  1  0  0  1
A3  0  0  0  0

[[3]]
   Q1 Q2 Q3 Q4
A1  1  0  0  1
A2  0  0  0  0
A3  0  1  1  0

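Now I want to take these results and turn them into a format that looks like this (HairEyeColor, a built-in R dataset, is a multidimensional array to begin with):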

> as.data.frame.table(HairEyeColor)
    Hair   Eye    Sex Freq
1  Black Brown   Male   32
2  Brown Brown   Male   53
3    Red Brown   Male   10
4  Blond Brown   Male    3
5  Black  Blue   Male   11
6  Brown  Blue   Male   50
7    Red  Blue   Male   10
8  Blond  Blue   Male   30
9  Black Hazel   Male   10
......

For my data, the result would look something like this (but note that the number of questions and the number of answers will vary from survey to survey):

    Q1    Q2    Q3    Q4  Freq
1  Q1A1  Q2A1  Q3A1  Q4A1    1
2  Q1A2  Q2A1  Q3A1  Q4A1    1
3  Q1A2  Q2A2  Q3A1  Q4A1    2
4  Q1A2  Q2A2  Q3A1  Q4A1    1
5  Q1A2  Q2A1  Q3A2  Q4A1    4
6  Q1A2  Q2A1  Q3A1  Q4A2    4
7  Q1A2  Q2A2  Q3A2  Q4A1    2
8  Q1A2  Q2A2  Q3A1  Q4A2    4
9  Q1A2  Q2A2  Q3A2  Q4A2    1
...

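If I had to do it by hand, I suppose I could brute-force it and use each possible combination of answers as the key of a key/value pair.

But I don't know where to start. I'm guessing there is already a function that handles this, but I can't seem to find one. Any ideas?

1 Answer:

Answer 0: (score: 0)

Well, this is the function I ended up writing. I had to learn a lot of new tricks along the way.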

createCombinationArray = function(data)
{
  dataLength = length(data)

  if(dataLength <= 0)
    return(NULL)

  d = dim(data[[1]])

  #layout as shown in the question: one row per answer, one column per question
  numAnswers = d[1]
  numQuestions = d[2]

  #the length of size is the number of dimensions in multi (one per question)
  #each value of size is the extent of that dimension (one slot per answer)
  size = rep.int(numAnswers, numQuestions)

  #create a zeroed-out multidimensional array
  multi = array(0, size)

  #loop through each survey matrix
  for(i in seq_len(dataLength))
  {
    indices = rep(1, numQuestions)
    #loop through each question in the survey
    for(j in seq_len(numQuestions))
    {
      #loop through each answer and determine the selected one
      #(in my case, wherever the entry is non-zero)
      for(k in seq_len(numAnswers))
      {
        #build up the indices that will be used to update the array
        if(data[[i]][k, j] != 0)
          indices[j] = k
      }
    }
    #use a one-row matrix to address a single cell
    #this way the dimensions can be defined dynamically (i.e. not hard-coded like "multi[x,y,z,w]")
    multi[matrix(indices, 1)] = multi[matrix(indices, 1)] + 1
  }

  return(multi)
}
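
The trick in the last line of the loop is that indexing an array with a one-row matrix (one column per dimension) picks out a single cell, whereas a plain vector index would be read as several separate linear positions. A minimal sketch of that in isolation, using a made-up array called counts with arbitrary dimensions:

```r
# A 2 x 3 x 2 array of zeros; the dimensions are arbitrary illustration values.
counts <- array(0, dim = c(2, 3, 2))

# A one-row matrix with one column per dimension addresses exactly one cell.
idx <- matrix(c(2, 3, 1), nrow = 1)
counts[idx] <- counts[idx] + 1

# This is the same cell as the hard-coded form counts[2, 3, 1].
print(counts[2, 3, 1])

# By contrast, a plain vector index selects three separate linear positions.
print(length(counts[c(2, 3, 1)]))
```

Because the index is ordinary data, the same line works no matter how many questions (dimensions) a survey has.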

To get the desired format:

createCombinationTable = function(data)
{
  return(as.data.frame.table(createCombinationArray(data)))
}
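
For comparison, base R's table() can probably do the counting without explicit loops. The following is only a sketch, assuming the layout shown in the question (one row per answer, one column per question) and exactly one non-zero entry per column. The names makeSurvey, surveys, picked, responses and freq are made up for illustration and are not part of the original function.

```r
# Rebuild the three example surveys from the question.
# makeSurvey is a hypothetical helper: selected[j] is the answer chosen for question j.
makeSurvey <- function(selected) {
  m <- matrix(0, nrow = 3, ncol = 4,
              dimnames = list(paste0("A", 1:3), paste0("Q", 1:4)))
  m[cbind(selected, 1:4)] <- 1
  m
}
surveys <- list(makeSurvey(c(1, 1, 1, 1)),
                makeSurvey(c(2, 1, 1, 2)),
                makeSurvey(c(1, 3, 3, 1)))

answers <- rownames(surveys[[1]])

# One row per survey, one column per question, holding the index of the selected answer.
picked <- do.call(rbind, lapply(surveys, function(m) apply(m != 0, 2, which.max)))

# Turn the indices into factors so that unused answers still appear as levels.
responses <- as.data.frame(lapply(as.data.frame(picked),
                                  function(idx) factor(answers[idx], levels = answers)))

# Cross-tabulate all questions at once and flatten to one row per combination.
freq <- as.data.frame(table(responses))

# Show only the combinations that actually occurred.
print(freq[freq$Freq > 0, ])
```

Here the columns come out named Q1 to Q4 with answer labels A1 to A3, because the labels are carried by the factors rather than by a bare array.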