Converting a melted data frame into a matrix in R

Time: 2013-08-22 20:33:14

Tags: r matrix dataframe reshape2

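I want to write a function that converts a data frame into a matrix. The data frame is a list of events. Each row corresponds to a person who either visited or bought a product.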

my.df <- data.frame(person = c('A', 'A', 'B', 'B', 'B', 'C'),
                    week = c(1, 2, 1, 3, 3, 2),
                    event = c('visit', 'buy', 'visit', 'visit', 'buy', 'visit'))
> my.df
  person week event
1      A    1 visit
2      A    2   buy
3      B    1 visit
4      B    3 visit
5      B    3   buy
6      C    2 visit

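The desired output matrix has people as rows and weeks as columns. For each (person, week) entry, I want "buy" if the person bought something that week; otherwise "visit" if the person visited; otherwise "none". More specifically, the desired output is the following matrix: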

> my.mat
  1       2       3     
A "visit" "buy"   "none"
B "visit" "none"  "buy" 
C "none"  "visit" "none"

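I think I should convert the events to numbers, do the cast using max, and then convert the numbers back to events, but I'm not entirely sure how to put it all together.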

3 Answers:

Answer 0: (score: 2)

As Arun pointed out, use the reshape2 package:

library(reshape2)

# there is a variety of ways to get the precedence you like
# I chose to just sort the strings
acast(my.df, person ~ week, function(x) {sort(as.character(x))[1]},
      value.var = 'event', fill = 'none')
#  1       2       3     
#A "visit" "buy"   "none"
#B "visit" "none"  "buy" 
#C "none"  "visit" "none"
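
The sort works here only because "buy" happens to come before "visit" alphabetically. If the events had other names, the precedence could be spelled out instead. A minimal sketch, using a hypothetical helper pick_event that is not part of reshape2, and assuming the ordering from the question (buy beats visit):

```r
# Hypothetical helper: return the highest-priority event present in x.
# `precedence` lists events from most to least important (an assumption
# matching the question: buy beats visit).
pick_event <- function(x, precedence = c('buy', 'visit')) {
  present <- precedence[precedence %in% as.character(x)]
  # NA when none of the listed events occurs in x
  if (length(present) > 0) present[1] else NA_character_
}

pick_event(c('visit', 'buy'))
pick_event(factor('visit'))
```

It should be usable as the aggregation function in the acast call above in place of the sort, with fill = 'none' still covering the empty cells.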

Answer 1: (score: 1)

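Just a snippet of code:

event <- my.df$event
levels(factor(event))
as.numeric(factor(event))
levels(factor(event))[as.numeric(factor(event))[1]]

The first line pulls the event column out of the data frame. The second shows which distinct events you have. The third converts your events to numbers. The fourth gives the text corresponding to a numbered element (here, the first one). Note that the numbers index levels(factor(event)), which is sorted, and not unique(event), which is in order of appearance.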
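
To illustrate the round trip, here is a small base R sketch with the event vector copied from the question. Setting the levels explicitly (an assumption about the desired ordering, visit below buy) is what makes the numbers meaningful for max:

```r
event <- c('visit', 'buy', 'visit', 'visit', 'buy', 'visit')

# Explicit levels fix the coding: visit = 1, buy = 2
f <- factor(event, levels = c('visit', 'buy'))
codes <- as.numeric(f)

# Map the codes back to text through the levels
decoded <- levels(f)[codes]
identical(decoded, event)
```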

Answer 2: (score: 1)

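Building on the answers from @eddi and @Rodrigo, I came up with the code below. It is a bit verbose, but it works. It would also work if I wanted a more complicated ordering of events.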

require(reshape2) # For acast(...)

# Input data frame
my.df <- data.frame(person = c('A', 'A', 'B', 'B', 'B', 'C'),
                    week = c(1, 2, 1, 3, 3, 2),
                    event = c('visit', 'buy', 'visit', 'visit', 'buy', 'visit'))

# Convert event into numbers, with buy > visit
the.levels <- c('visit', 'buy')
my.df$event <- as.numeric(factor(my.df$event, levels = the.levels))

# Build matrix
temp <- acast(my.df, person ~ week, function(x) {max(x)},
             value.var = 'event', fill = 0)

# Convert event numbers back into text via a named lookup vector
number.to.event <- setNames(c('none', 'visit', 'buy'),
                            as.character(c(0, 1, 2)))
# Rebuild the matrix, keeping temp's row names and column names
out <- matrix(number.to.event[as.character(temp)], nrow = nrow(temp),
              dimnames = dimnames(temp))

> out
  1       2       3     
A "visit" "buy"   "none"
B "visit" "none"  "buy" 
C "none"  "visit" "none"
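
Since the question asked for a function, the same idea can be wrapped up as one. The following is only a sketch: it swaps acast for base R's tapply so that it runs without reshape2, and the name events_to_matrix and its arguments are made up for illustration.

```r
# Hypothetical wrapper; `levels` lists events from lowest to highest priority.
events_to_matrix <- function(df, levels = c('visit', 'buy'), none = 'none') {
  # Refuse events that have no place in the ordering
  stopifnot(all(as.character(df$event) %in% levels))
  # Code events by position in `levels`, so later levels win under max
  code <- as.integer(factor(as.character(df$event), levels = levels))
  # One cell per (person, week); cells with no events come back as NA
  m <- tapply(code, list(df$person, df$week), max)
  m[is.na(m)] <- 0L
  # Look the codes up again, with 0 meaning "no event"
  matrix(c(none, levels)[as.vector(m) + 1L], nrow = nrow(m),
         dimnames = dimnames(m))
}

my.df <- data.frame(person = c('A', 'A', 'B', 'B', 'B', 'C'),
                    week = c(1, 2, 1, 3, 3, 2),
                    event = c('visit', 'buy', 'visit', 'visit', 'buy', 'visit'))
result <- events_to_matrix(my.df)
result
```

A longer ordering only needs a longer levels vector, for example c('visit', 'cart', 'buy') if such an event existed in the data.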