I have the following sales data:
+------------+------+-------+
| Receipt ID | Item | Value |
+------------+------+-------+
| 1 | a | 2 |
| 1 | b | 3 |
| 1 | c | 2 |
| 1 | k | 4 |
| 2 | a | 2 |
| 2 | b | 5 |
| 2 | d | 6 |
| 2 | k | 7 |
| 3 | a | 8 |
| 3 | k | 1 |
| 3 | c | 2 |
| 3 | q | 3 |
| 4 | k | 4 |
| 4 | a | 5 |
| 5 | b | 6 |
| 5 | a | 7 |
| 6 | a | 8 |
| 6 | b | 3 |
| 6 | c | 4 |
+------------+------+-------+
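Using the Apriori algorithm, I mine rules and split each rule into separate columns.
For example, after pruning by support, confidence, and lift, I get the output below. I only keep the columns Target Item, Item1, and Item2, where each rule has the form {Item1, Item2} -> {Target Item}.
The output looks like this: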
+-------------+-------+-------+
| Target Item | Item1 | Item2 |
+-------------+-------+-------+
| a | b | |
| a | b | c |
| a | k | |
+-------------+-------+-------+
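For each rule, I want to find all receipts that contain the rule's full item combination. Within only those receipts, I want the sales value of the target item and the combined sales value of Item1 and Item2.
The output should look like this (I don't need the Receipt IDs column shown below):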
+-------------+-------+-------+-------------+----------------------+---------------------------------+
| Target Item | Item1 | Item2 | Receipt IDs | Value of Target Item | Remaining value (Item1 + Item2) |
+-------------+-------+-------+-------------+----------------------+---------------------------------+
| a           | b     |       | 1,2,5,6     | 2+2+7+8              | 3+5+6+3                         |
| a           | b     | c     | 1,6         | 2+8                  | (3+3) + (2+4)                   |
| a           | k     |       | 1,2,3,4     | 2+2+8+5              | 4+7+1+4                         |
+-------------+-------+-------+-------------+----------------------+---------------------------------+
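To make the calculation I'm describing concrete, here is a minimal base R sketch of it. It is only one possible way to do this, not a tested solution. The names `sales`, `rules`, `baskets`, and `summarise_rule` are made up for illustration, the rules are typed in by hand rather than taken from `apriori()`, and an empty Item2 is assumed to be stored as `NA`. It reports totals instead of expressions such as 2+2+7+8.

```r
# Sales data from the question
sales <- data.frame(
  Receipt_ID = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,6,6,6),
  item  = c("a","b","c","k","a","b","d","k","a","k","c","q","k","a","b","a","a","b","c"),
  value = c(2,3,2,4,2,5,6,7,8,1,2,3,4,5,6,7,8,3,4),
  stringsAsFactors = FALSE
)

# Hand-typed rules in the Target Item / Item1 / Item2 layout (assumption: NA = empty)
rules <- data.frame(
  target = c("a", "a", "a"),
  item1  = c("b", "b", "k"),
  item2  = c(NA, "c", NA),
  stringsAsFactors = FALSE
)

# Items bought on each receipt, keyed by receipt ID
baskets <- split(sales$item, sales$Receipt_ID)

summarise_rule <- function(target, item1, item2) {
  lhs <- c(item1, item2)
  lhs <- lhs[!is.na(lhs)]
  # A receipt qualifies only if it contains the target and every lhs item
  hit <- vapply(baskets, function(b) all(c(target, lhs) %in% b), logical(1))
  ids <- names(baskets)[hit]
  in_hit <- as.character(sales$Receipt_ID) %in% ids
  data.frame(
    target = target, item1 = item1, item2 = item2,
    receipts = paste(ids, collapse = ","),
    target_value = sum(sales$value[in_hit & sales$item == target]),
    remaining_value = sum(sales$value[in_hit & sales$item %in% lhs]),
    stringsAsFactors = FALSE
  )
}

# Apply the summary to every rule and stack the one-row results
result <- do.call(rbind, Map(summarise_rule, rules$target, rules$item1, rules$item2))
rownames(result) <- NULL
result
```

The receipt filter is the key step: values are summed only over receipts that contain the whole combination, so a receipt with `a` and `b` but no `c` does not contribute to the `{b, c} -> {a}` row.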
Code to reproduce the Apriori step:
library(arules)

# Build the sample sales data
Data <- data.frame(
  Receipt_ID = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,6,6,6),
  item = c('a','b','c','k','a','b','d','k','a','k','c','q','k','a','b','a','a','b','c'),
  value = c(2,3,2,4,2,5,6,7,8,1,2,3,4,5,6,7,8,3,4)
)

# Write to CSV and read it back as transactions (one row per receipt/item pair)
write.table(Data, "item.csv", sep = ',', row.names = FALSE)
data_frame <- read.transactions(
  file = "item.csv",
  format = "single",
  header = TRUE,  # the file has a header row, so columns can be referenced by name
  sep = ",",
  cols = c("Receipt_ID", "item"),
  rm.duplicates = TRUE
)

# Mine rules with the default support and confidence thresholds
rules_apriori <- apriori(data_frame)
rules_apriori

# Convert the rules to a data frame and split each rule string into lhs and rhs
rules_tab <- as(rules_apriori, "data.frame")
rules_tab
out <- do.call(rbind, strsplit(as.character(rules_tab$rules), '=>'))

# Strip the braces and the whitespace left around the "=>" separator
rules_tab$lhs <- trimws(gsub("[{}]", "", out[, 1]))
rules_tab$rhs <- trimws(gsub("[{}]", "", out[, 2]))

# One row per rule: the target item and its comma-separated item combination
rules_final <- data.frame(
  target_item = rules_tab$rhs,
  item_Combination = rules_tab$lhs,
  stringsAsFactors = FALSE
)
rules_final
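To get from the comma-separated item combination to the Item1 / Item2 column layout, something along these lines might work. This is a rough sketch under assumptions: `rules_final` here is a small hand-written data frame in the same shape as the one above, not real `apriori()` output (if yours is a matrix, wrap it in `as.data.frame()` first), and `lhs_items`, `padded`, and `rules_wide` are names invented for the example.

```r
# Hand-written stand-in with the same columns as rules_final (assumption)
rules_final <- data.frame(
  target_item = c("a", "a", "a"),
  item_Combination = c("b", "b,c", "k"),
  stringsAsFactors = FALSE
)

# Split each combination into its individual items
lhs_items <- strsplit(rules_final$item_Combination, ",", fixed = TRUE)
lhs_items <- lapply(lhs_items, trimws)

# Pad every rule with NA up to the length of the longest combination
width <- max(lengths(lhs_items))
padded <- do.call(rbind, lapply(lhs_items, function(x) x[seq_len(width)]))
colnames(padded) <- paste0("Item", seq_len(width))

# One column per item position, next to the target item
rules_wide <- data.frame(
  Target_Item = rules_final$target_item,
  padded,
  stringsAsFactors = FALSE
)
rules_wide
```

Padding with `NA` keeps rules with one item and rules with two items in the same table, which matches the blank Item2 cells in the layout I showed earlier.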