I'm trying to figure out a way to get a list of product combinations with their unique-user counts in R. This is a follow-up to the question "Generate matrix of unique user-item cross-product combinations".
df <- data.frame(Products=c('Product a', 'Product b', 'Product a',
                            'Product c', 'Product b', 'Product c', 'Product d'),
                 Users=c('user1', 'user1', 'user2', 'user1',
                         'user2', 'user3', 'user1'))
The output of df is:
Products Users
1 Product a user1
2 Product b user1
3 Product a user2
4 Product c user1
5 Product b user2
6 Product c user3
7 Product d user1
The output I'm looking for is every combination of three products with its user count:
Product a/Product b/Product c - 3
Product a/Product b/Product d - 2
Product b/Product c/Product d - 3
...
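Thanks again for your help.
Answer 0 (score: 2)
It looks like you want logical-OR as the relationship between a user and each product set. In other words, you want to count how many unique users have any product in the set. Here is one way to do it: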
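# Build the sample data; keep both columns as character vectors rather than factors.
df <- data.frame(Products=c('Product a','Product b','Product a','Product c','Product b','Product c','Product d'),
                 Users=c('user1','user1','user2','user1','user2','user3','user1'),
                 stringsAsFactors=FALSE)
# Every combination of three distinct products, one combination per column.
comb <- combn(unique(df$Products), 3)
# For each combination, count the unique users who have at least one of its products.
data.frame(comb=apply(comb, 2, paste, collapse='/'),
           num=apply(comb, 2, function(x) length(unique(df$Users[df$Products %in% x]))))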
## comb num
## 1 Product a/Product b/Product c 3
## 2 Product a/Product b/Product d 2
## 3 Product a/Product c/Product d 3
## 4 Product b/Product c/Product d 3
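As a cross-check on the logical-OR counts above, here is a minimal sketch that gets the same numbers from a user-by-product incidence matrix. The helper name count_any is made up for illustration and is not part of the original answer; the data is the sample from the question.

```r
# Sample data from the question, kept as character vectors.
df <- data.frame(
  Products = c('Product a', 'Product b', 'Product a', 'Product c',
               'Product b', 'Product c', 'Product d'),
  Users = c('user1', 'user1', 'user2', 'user1', 'user2', 'user3', 'user1'),
  stringsAsFactors = FALSE
)

# Incidence matrix: one row per user, one column per product,
# TRUE where the user has that product.
inc <- unclass(table(df$Users, df$Products)) > 0

# Hypothetical helper: number of users with at least one product in x.
count_any <- function(x) sum(rowSums(inc[, x, drop = FALSE]) > 0)

# All three-product combinations, labelled "a/b/c" style.
comb <- combn(colnames(inc), 3)
or_counts <- setNames(apply(comb, 2, count_any),
                      apply(comb, 2, paste, collapse = '/'))
print(or_counts)
```

Building the matrix once might be preferable when there are many combinations, since each count then reduces to a column subset and a row sum.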
Edit: Logical-AND is trickier, because for each user we need to test for the presence of every product in the set. I think I found a good solution using aggregate() and match():
# For each combination, count the users whose products include every product in the combination.
data.frame(comb=apply(comb, 2, paste, collapse='/'),
           num=apply(comb, 2, function(x) sum(aggregate(Products~Users, df, function(y) !any(is.na(match(x, y))))$Products)))
## comb num
## 1 Product a/Product b/Product c 1
## 2 Product a/Product b/Product d 1
## 3 Product a/Product c/Product d 1
## 4 Product b/Product c/Product d 1
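The same logical-AND test can also be written without aggregate(). This is only a sketch of an alternative, assuming the sample data from the question: it uses split() to collect each user's products and all(x %in% p) for the membership check. The helper name count_all is made up for illustration.

```r
# Sample data from the question, kept as character vectors.
df <- data.frame(
  Products = c('Product a', 'Product b', 'Product a', 'Product c',
               'Product b', 'Product c', 'Product d'),
  Users = c('user1', 'user1', 'user2', 'user1', 'user2', 'user3', 'user1'),
  stringsAsFactors = FALSE
)

# Named list: each user's vector of products.
by_user <- split(df$Products, df$Users)

# Hypothetical helper: number of users who have every product in x.
count_all <- function(x) {
  sum(vapply(by_user, function(p) all(x %in% p), logical(1)))
}

# All three-product combinations, one per column.
comb <- combn(unique(df$Products), 3)
and_counts <- data.frame(
  comb = apply(comb, 2, paste, collapse = '/'),
  num = apply(comb, 2, count_all)
)
print(and_counts)
```

Only user1 holds three or more products in this data, so every combination comes out with a count of 1, in line with the output above.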