Transforming data in R

Time: 2015-06-03 00:46:33

Tags: r

I have loaded a dataset into R. The dataset has 2 columns:

user_id merchant_id
514729  14852,16695
1240327 23590
7457    211
359027  2483
463149  5802
514730  5460,1896
41953   7183,147105
927805  304,3909,4151,32,3,39171

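As you can see, some user IDs are associated with multiple merchants. What I want to do is transform the data so that it has the following schema: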

User Id MerchantId1 MerchantId2 MerchantId3 MerchantId4
123445  0           1           0           1
123453  1           0           0           0

Basically, I want to create a matrix of user IDs and merchant IDs, filled with 1 or 0 depending on whether the user_id has the merchant_id.

Any suggestions or help on how to achieve this?

I want to use this to build a recommender system. Any help would be great.

2 answers:

Answer 0: (score: 1)

My interpretation is that you are after the following:

library(splitstackshape)
cSplit_e(mydf, "merchant_id", ",", type = "character", fill = 0)
##   user_id merchant_id merchant_id_147105 merchant_id_14852 merchant_id_16695
## 1  514729 14852,16695                  0                 1                 1
## 2 1240327       23590                  0                 0                 0
## 3    7457         211                  0                 0                 0
## 4  359027        2483                  0                 0                 0
## 5  463149        5802                  0                 0                 0
## 6  514730   5460,1896                  0                 0                 0
## 7   41953 7183,147105                  1                 0                 0
##   merchant_id_1896 merchant_id_211 merchant_id_23590 merchant_id_2483
## 1                0               0                 0                0
## 2                0               0                 1                0
## 3                0               1                 0                0
## 4                0               0                 0                1
## 5                0               0                 0                0
## 6                1               0                 0                0
## 7                0               0                 0                0
##   merchant_id_5460 merchant_id_5802 merchant_id_7183
## 1                0                0                0
## 2                0                0                0
## 3                0                0                0
## 4                0                0                0
## 5                0                1                0
## 6                1                0                0
## 7                0                0                1
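
For comparison, the same indicator layout can be sketched in base R with strsplit() and table(). This is a minimal sketch under assumptions: mydf is taken to hold user_id as a number and merchant_id as a comma-separated character string, and the four sample rows are re-typed from the question rather than taken from the answer's own data.

```r
# Hypothetical sample: four rows re-typed from the question's table
mydf <- data.frame(
  user_id = c(514729, 1240327, 7457, 514730),
  merchant_id = c("14852,16695", "23590", "211", "5460,1896"),
  stringsAsFactors = FALSE
)

# Split each comma-separated string into a character vector of merchant IDs
merchants <- strsplit(mydf$merchant_id, ",", fixed = TRUE)

# One row per (user, merchant) pair
long <- data.frame(
  user_id = rep(mydf$user_id, lengths(merchants)),
  merchant_id = unlist(merchants),
  stringsAsFactors = FALSE
)

# Cross-tabulate users against merchants, then clip counts to 0/1.
# Rows and columns come out sorted by table(), not in input order.
m <- unclass(table(long$user_id, long$merchant_id))
m[m > 1] <- 1L
m
```

The result is a plain integer matrix with user IDs as row names and merchant IDs as column names, which may be a more convenient starting point for a recommender than a data frame that still carries the original merchant_id column.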

Answer 1: (score: 1)

You can also use mtabulate from qdapTools.

Data
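
The definition of df1 is not shown here. A minimal sketch of how it could be rebuilt, assuming it matches the eight rows printed in the question (the column types are a guess; merchant_id must be character for strsplit() to work):

```r
# Hypothetical reconstruction of df1 from the eight rows printed in the question
df1 <- data.frame(
  user_id = c(514729, 1240327, 7457, 359027, 463149, 514730, 41953, 927805),
  merchant_id = c("14852,16695", "23590", "211", "2483", "5802",
                  "5460,1896", "7183,147105", "304,3909,4151,32,3,39171"),
  stringsAsFactors = FALSE  # strsplit() needs character input, not factors
)
str(df1)
```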

library(qdapTools)
cbind(df1, mtabulate(strsplit(df1$merchant_id, ','))) 
#   user_id              merchant_id 147105 14852 16695 1896 211 23590 2483 3 304
#1  514729              14852,16695      0     1     1    0   0     0    0 0   0
#2 1240327                    23590      0     0     0    0   0     1    0 0   0
#3    7457                      211      0     0     0    0   1     0    0 0   0
#4  359027                     2483      0     0     0    0   0     0    1 0   0
#5  463149                     5802      0     0     0    0   0     0    0 0   0
#6  514730                5460,1896      0     0     0    1   0     0    0 0   0
#7   41953              7183,147105      1     0     0    0   0     0    0 0   0
#8  927805 304,3909,4151,32,3,39171      0     0     0    0   0     0    0 1   1
#  32 3909 39171 4151 5460 5802 7183
#1  0    0     0    0    0    0    0
#2  0    0     0    0    0    0    0
#3  0    0     0    0    0    0    0
#4  0    0     0    0    0    0    0
#5  0    0     0    0    0    1    0
#6  0    0     0    0    1    0    0
#7  0    0     0    0    0    0    1
#8  1    1     1    1    0    0    0