I have a .csv file containing data in the following format:
Day Item
1 12,19,24,31,48,
1 1,19,
1 16,28,32,45,
1 19,36,41,43,44,
1 7,24,27,
1 21,31,33,41,
1 46
1 50
2 12,31,36,48,
2 17,29,47,
2 2,18,20,29,38,39,40,41
2 17,29,47,
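I cannot read it correctly with read.transactions.
The dataset records multiple item selections per day (several selections per day where necessary). For example, the third selection on day 1 returned 16, 28, 32 and 45.
Shouldn't the following be enough?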
library(arules)
dataset <- read.transactions("file.csv", format = 'basket')
Answer 0 (score: 0)
I tried to create sample data from the data you provided:
library(arules)
data <- read.table(text="Day Item
1 12,19,24,31,48,
1 1,19,
1 16,28,32,45,
1 19,36,41,43,44,
1 7,24,27,
1 21,31,33,41,
1 46
1 50
2 12,31,36,48,
2 17,29,47,
2 2,18,20,29,38,39,40,41
2 17,29,47", header = TRUE)
## split each comma-separated Item string into separate items (the Day column is not used)
items <- strsplit(as.character(data$Item), ",")
data <- as(items, "transactions")
inspect(data)
## apply apriori algorithm ###
rules <- apriori(data, parameter = list(supp = 0.001, conf = 0.80))
### Show the top 10 rules sorted by lift ####
inspect(head(sort(rules, by = "lift"), 10))
Please try this approach; I hope it helps.
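If the data has to come straight from the file instead of pasted text, one option is to parse the lines by hand. As far as I can tell, with format = 'basket' read.transactions splits on whitespace by default, so each line would be read as two items (the day and the whole comma-separated string). The following is only a sketch, assuming every line after a one-line header has the form "<day> <comma-separated items>". The names `sample_lines` and `parse_baskets` are hypothetical and not part of arules, and the arules step is skipped when the package is not installed.

```r
# Sample lines standing in for the file; for a real file this would
# presumably be: sample_lines <- readLines("file.csv")
sample_lines <- c(
  "Day Item",
  "1 12,19,24,31,48,",
  "1 1,19,",
  "1 16,28,32,45,",
  "1 46",
  "2 17,29,47,"
)

# Hypothetical helper (not part of arules): turn "<day> <items>" lines
# into a list with one character vector of items per transaction.
parse_baskets <- function(lines, header = TRUE) {
  if (header) lines <- lines[-1]          # drop the "Day Item" header
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]           # ignore blank lines
  # Remove the leading day number and the whitespace that follows it
  item_strings <- sub("^[^[:space:]]+[[:space:]]+", "", lines)
  baskets <- strsplit(item_strings, ",", fixed = TRUE)
  lapply(baskets, function(x) {
    x <- trimws(x)
    unique(x[nzchar(x)])                  # drop empty and repeated items
  })
}

baskets <- parse_baskets(sample_lines)
str(baskets)

# Convert to transactions only if arules is available
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  trans <- as(baskets, "transactions")
  inspect(trans)
}
```

The Day value is discarded here, as in the answer above; if it matters for the analysis, it would have to be extracted from each line and kept alongside the baskets.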