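I want to write a function that converts a data frame into a matrix. The data frame is a list of events: each row corresponds to a person visiting or buying a product.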
my.df <- data.frame(person = c('A', 'A', 'B', 'B', 'B', 'C'),
week = c(1, 2, 1, 3, 3, 2),
event = c('visit', 'buy', 'visit', 'visit', 'buy', 'visit'))
> my.df
  person week event
1      A    1 visit
2      A    2   buy
3      B    1 visit
4      B    3 visit
5      B    3   buy
6      C    2 visit
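The desired output matrix has people as rows and weeks as columns. In the (person, week) entry I want 'buy' if the person bought that week; otherwise 'visit' if the person visited; otherwise 'none'. Concretely, the desired output is the following matrix: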
> my.mat
  1       2       3
A "visit" "buy"   "none"
B "visit" "none"  "buy"
C "none"  "visit" "none"
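I think I should convert the events to numbers, do the reshaping with max, and then convert the numbers back to events, but I'm not entirely sure how to put it all together.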
Answer 0 (score: 2)
As Arun pointed out, use the reshape2 package:
library(reshape2)
# there are several ways to get the precedence you want;
# here I just sort the strings, since 'buy' sorts before 'visit'
acast(my.df, person ~ week, function(x) {sort(as.character(x))[1]},
value.var = 'event', fill = 'none')
#  1       2       3
#A "visit" "buy"   "none"
#B "visit" "none"  "buy"
#C "none"  "visit" "none"
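If alphabetical order does not happen to match the precedence you want, one option is to spell the precedence out. A minimal sketch, where `pick.event` and `priority` are hypothetical names introduced here for illustration (they are not part of reshape2):

```r
# Hypothetical helper: return the highest-priority event present in x.
# `priority` lists events from most to least important.
pick.event <- function(x, priority = c('buy', 'visit')) {
  x <- as.character(x)
  # Keep the priority order, drop events that did not occur, take the first
  priority[priority %in% x][1]
}

pick.event(c('visit', 'buy'))   # "buy"
pick.event('visit')             # "visit"
```

Passing `pick.event` as the aggregation function in the `acast` call above should give the same matrix for this data; that is an expectation, not something shown in the original answer.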
Answer 1 (score: 1)
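Just a snippet:
unique(my.df$event)
as.numeric(factor(my.df$event))
levels(factor(my.df$event))[as.numeric(factor(my.df$event))[1]]
The first line shows which distinct events you have. The second converts your events to numbers. The third gives the text that corresponds to a numeric code (here, the code of element 1); note that it indexes levels(), not unique(), because factor codes follow the sorted levels rather than the order of first appearance.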
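To make the round trip concrete, here is a small sketch on the question's `event` column. The explicit `levels` argument is an addition (the snippet above relies on the default alphabetical levels); it lets the numeric codes encode the precedence visit < buy.

```r
event <- c('visit', 'buy', 'visit', 'visit', 'buy', 'visit')

# Explicit levels: 'visit' becomes 1 and 'buy' becomes 2
f <- factor(event, levels = c('visit', 'buy'))
codes <- as.integer(f)
codes                      # 1 2 1 1 2 1

# Map codes back to text through the factor's levels
levels(f)[codes]           # identical to the original event vector
levels(f)[max(codes)]      # "buy", the highest-precedence event
```

With the codes ordered this way, taking `max` within each (person, week) group selects 'buy' over 'visit' whenever both occur.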
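Answer 2 (score: 1)
Building on the answers from @eddi and @Rodrigo, I arrived at the code below. It is a bit verbose, but it works, and it would also work if I wanted a more complex ordering of events.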
require(reshape2) # For acast(...)
# Input data frame
my.df <- data.frame(person = c('A', 'A', 'B', 'B', 'B', 'C'),
week = c(1, 2, 1, 3, 3, 2),
event = c('visit', 'buy', 'visit', 'visit', 'buy', 'visit'))
# Convert event into numbers, with buy > visit
the.levels <- c('visit', 'buy')
my.df$event <- as.numeric(factor(my.df$event, levels = the.levels))
# Build matrix
temp <- acast(my.df, person ~ week, function(x) {max(x)},
value.var = 'event', fill = 0)
# Convert event numbers back into text, using a named character vector as lookup
number.to.event <- setNames(c('none', 'visit', 'buy'),
                            as.character(c(0, 1, 2)))
# Rebuild the matrix, keeping the row names and column names of temp
out <- matrix(number.to.event[as.character(temp)], nrow = nrow(temp),
              dimnames = dimnames(temp))
> out
  1       2       3
A "visit" "buy"   "none"
B "visit" "none"  "buy"
C "none"  "visit" "none"
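Since the question asks for a function, the steps above can be wrapped into one. The following is a sketch in base R that uses `tapply` instead of `acast`; the function name `events.to.matrix` and its arguments are hypothetical, and it assumes the columns are named `person`, `week` and `event` as in the question.

```r
# Hypothetical wrapper: events data frame -> person x week character matrix.
# `levels` lists events from lowest to highest precedence.
events.to.matrix <- function(df, levels = c('visit', 'buy'), fill = 'none') {
  # Encode events as integers so that max() picks the highest precedence
  codes <- as.integer(factor(df$event, levels = levels))
  # Max code per (person, week); combinations with no events become NA
  temp <- tapply(codes, list(df$person, df$week), max)
  # Decode: NA cells become `fill`, other cells index into `levels`
  out <- ifelse(is.na(temp), fill, levels[temp])
  dimnames(out) <- dimnames(temp)
  out
}

my.df <- data.frame(person = c('A', 'A', 'B', 'B', 'B', 'C'),
                    week = c(1, 2, 1, 3, 3, 2),
                    event = c('visit', 'buy', 'visit', 'visit', 'buy', 'visit'))
events.to.matrix(my.df)
```

For the question's data this returns the same person-by-week matrix as `out` above. A more complex ordering only requires a longer `levels` vector, listed from lowest to highest precedence.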