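R: assign a product type to each user based on grouped results

Time: 2018-08-14 01:31:26

Tags: r dplyr

The dataset is a sample from an online marketplace (eBay, Amazon), based on online purchase information.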

user_id, product_code, bought_date, time_spent, store_id, product_type, refurbished, unqiue_visit_id
001, e.12, 20120102, 104, 101, computer, yes, 1010
002, e.24, 20120201, 100, 101, infant-dress, no, 2001
003, s.32, 20130302, 230, 101, shoes, no, 2121
004, y.23, 20130404, 212, 103, computer, yes, 2422
005, s.43, 20130803, 104, 101, laptop, yes, 2342
001, a.12, 20120202, 104, 101, computer, yes, 1011
002, b.24, 20120201, 100, 101, infant-dress, no, 2001
003, c.32, 20130302, 230, 101, shoes, no, 2122
004, e.23, 20130404, 212, 103, computer, yes, 2424
005, f.43, 20130803, 104, 101, laptop, yes, 2340
001, g.12, 20120102, 104, 101, computer, yes, 1013
002, h.24, 20120201, 100, 101, infant-dress, no, 2031
003, l.32, 20130302, 230, 101, shoes, no, 2000
004, m.23, 20130404, 212, 103, computer, yes, 1422
005, d.43, 20130803, 104, 101, laptop, yes, 1142
001, d.12, 20120102, 104, 101, desk, yes, 1110
002, f.24, 20120201, 100, 101, glass, no, 1111
003, n.32, 20130302, 230, 101, liquid, no, 2021
004, t.23, 20130404, 212, 103, liquid, yes, 22
005, u.43, 20130803, 104, 101, dress, yes, 2942
001, d.12, 20120102, 104, 101, desk, yes, 1910
002, f.24, 20120201, 100, 101, glass, no, 2901
003, n.32, 20130302, 230, 101, liquid, no, 2921
004, t.23, 20130404, 212, 103, liquid, yes, 2922
005, u.43, 20130803, 104, 101, dress, yes, 2942
001, kk.12, 20120103, 105, 101, desk, yes, 410
003, n.32, 20130303, 230, 101, liquid, no, 2621

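unique_visit_id (spelled unqiue_visit_id in the data) is created from user_id, product_code, store_id, product_type, and bought_date.

The goal is to first get the number of unique visits by grouping on user_id and product_type.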

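library(dplyr)

# test is the data frame holding the rows shown above
test.visits <- test %>% 
  group_by(user_id, product_type) %>% 
  # count distinct visit ids per user and product type
  summarize(visit_count = n_distinct(unqiue_visit_id)) %>% 
  arrange(desc(visit_count), user_id)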


   user_id product_type    visit_count
     <int> <fct>           <int>
 1       1 " computer"         3
 2       1 " desk"             3
 3       2 " infant-dress"     3
 4       3 " liquid"           3
 5       3 " shoes"            3
 6       4 " computer"         3
 7       5 " laptop"           3
 8       2 " glass"            2
 9       4 " liquid"           2
10       5 " dress"            2

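Now I want to assign a product type to each user based on the highest visit count. If there is a tie, break it by recency (bought_date), then refurbished, then the last value of store_id.

Example: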
     1       1 " computer"         3
     2       1 " desk"             3

That is the basic tie condition. Within each user's group, the product_type with the highest visit count is assigned to the user.
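One possible way to express the rule above, as a sketch only. It assumes dplyr 1.0 or later, and it assumes a particular reading of the tie-break: highest visit_count first, then the most recent bought_date, then refurbished "yes" before "no", then the store_id of the latest purchase. The helper columns last_bought, any_refurb and last_store are made-up names, and the final alphabetical fallback on product_type is an arbitrary choice to keep the result deterministic. Note that on the sample rows exactly as printed, n_distinct gives 2 for user 2's infant-dress (visit id 2001 appears twice), so counts can differ from the table shown earlier.

```r
library(dplyr)

# Sample rows copied from the question. strip.white = TRUE removes the leading
# spaces that otherwise show up as " computer" in printed output.
test <- read.csv(text = "user_id, product_code, bought_date, time_spent, store_id, product_type, refurbished, unqiue_visit_id
001, e.12, 20120102, 104, 101, computer, yes, 1010
002, e.24, 20120201, 100, 101, infant-dress, no, 2001
003, s.32, 20130302, 230, 101, shoes, no, 2121
004, y.23, 20130404, 212, 103, computer, yes, 2422
005, s.43, 20130803, 104, 101, laptop, yes, 2342
001, a.12, 20120202, 104, 101, computer, yes, 1011
002, b.24, 20120201, 100, 101, infant-dress, no, 2001
003, c.32, 20130302, 230, 101, shoes, no, 2122
004, e.23, 20130404, 212, 103, computer, yes, 2424
005, f.43, 20130803, 104, 101, laptop, yes, 2340
001, g.12, 20120102, 104, 101, computer, yes, 1013
002, h.24, 20120201, 100, 101, infant-dress, no, 2031
003, l.32, 20130302, 230, 101, shoes, no, 2000
004, m.23, 20130404, 212, 103, computer, yes, 1422
005, d.43, 20130803, 104, 101, laptop, yes, 1142
001, d.12, 20120102, 104, 101, desk, yes, 1110
002, f.24, 20120201, 100, 101, glass, no, 1111
003, n.32, 20130302, 230, 101, liquid, no, 2021
004, t.23, 20130404, 212, 103, liquid, yes, 22
005, u.43, 20130803, 104, 101, dress, yes, 2942
001, d.12, 20120102, 104, 101, desk, yes, 1910
002, f.24, 20120201, 100, 101, glass, no, 2901
003, n.32, 20130302, 230, 101, liquid, no, 2921
004, t.23, 20130404, 212, 103, liquid, yes, 2922
005, u.43, 20130803, 104, 101, dress, yes, 2942
001, kk.12, 20120103, 105, 101, desk, yes, 410
003, n.32, 20130303, 230, 101, liquid, no, 2621",
  strip.white = TRUE, stringsAsFactors = FALSE)

# One row per (user_id, product_type) with the count and the tie-break keys.
per_type <- test %>%
  group_by(user_id, product_type) %>%
  summarize(
    visit_count = n_distinct(unqiue_visit_id),
    last_bought = max(bought_date),                       # recency (yyyymmdd integer)
    any_refurb  = any(refurbished == "yes"),              # assumed: "yes" wins a tie
    last_store  = last(store_id, order_by = bought_date), # store of the latest purchase
    .groups = "drop"
  )

# Sort so the preferred product type comes first for each user, then keep it.
assigned <- per_type %>%
  arrange(user_id, desc(visit_count), desc(last_bought),
          desc(any_refurb), desc(last_store), product_type) %>%
  group_by(user_id) %>%
  slice(1) %>%
  ungroup() %>%
  select(user_id, product_type, visit_count)

print(assigned)
```

With these assumptions, user 1's tie between computer and desk goes to computer (latest purchase 20120202 versus 20120103), and user 3's tie between liquid and shoes goes to liquid (20130303 versus 20130302). If the tie-break order is meant differently, only the arrange() call needs to change.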

0 answers:

No answers