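I can't figure out how to convert a list into transactions for further processing by the apriori algorithm. I have a synthetic example that works, and a real one (well, a subset of the Foodmart database) that doesn't; at the system level they look the same to me. Please help me convert the list into a transactions object.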
> version
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 0.2
year 2013
month 09
day 25
svn rev 63987
language R
version.string R version 3.0.2 (2013-09-25)
nickname Frisbee Sailing
> a_list <- list(
c("a","b","c"),
c("a","b"),
c("a","b","d"),
c("c","e"),
c("c","e"),
c("a","b","d","e")
)
> a_trans <- as(a_list,"transactions")
> summary(a_trans)
transactions as itemMatrix in sparse format with
6 rows (elements/itemsets/transactions) and
5 columns (items) and a density of 0.5333333
... and so on ...
2 b
3 c
> a_rules <- apriori(a_trans)
parameter specification:
confidence minval smax arem aval originalSupport support minlen maxlen target ext
... and so on ...
writing ... [17 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
> b_list <- list(
c("PigTail Frozen Pepperoni Pizza","Bird Call Childrens Cold Remedy","Steady Silky Smooth Hair Conditioner","CDR Regular Coffee"),
c("Horatio Graham Crackers","Excellent Apple Drink","Blue Medal Small Eggs","Cormorant Copper Cleaner","High Quality Copper Cleaner","Fast Apple Fruit Roll"),
c("Toucan Canned Mixed Fruit","Landslide Salt","Gorilla Sour Cream","Hermanos Firm Tofu"),
c("Swell Canned Mixed Fruit","Washington Diet Soda","Super Apple Jam","Plato Strawberry Preserves","Steady Whitening Toothpast","Steady Whitening Toothpast","Better Beef Soup","Hermanos Squash","Carrington Frozen Cheese Pizza","Fort West Fondue Mix","Best Choice Mini Donuts","Cormorant Copper Pot Scrubber","Ebony Cantelope","Denny D-Size Batteries","Akron Eyeglass Screwdriver"),
c("Big Time Ice Cream Sandwich","Musial Mints","Portsmouth Imported Beer","CDR Vegetable Oil","Just Right Rice Soup","Carrington Frozen Peas","High Quality 100 Watt Lightbulb","Fort West Dried Dates"),
c("Consolidated Tartar Control Toothpaste","Plato Tomato Sauce","Quick Seasoned Hamburger")
)
> b_trans <- as(b_list,"transactions")
Error in asMethod(object) :
can not coerce list with transactions with duplicated items
> summary(b_trans)
Error in summary(b_trans) :
error in evaluating the argument 'object' in selecting a method for function 'summary': Error: object 'b_trans' not found
> duplicated(a_list)
[1] FALSE FALSE FALSE FALSE TRUE FALSE
> duplicated(b_list)
[1] FALSE FALSE FALSE FALSE FALSE FALSE
Why does this mysterious (WTF) thing happen?
Answer 0 (score: 3)
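As joran and DWin mentioned, it is just what it looks like: one transaction contains a duplicated item. If I add a second "b" to the first vector of a_list2:
> a_list2 <- list(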
c("a","b","b","c"),
c("a","b"),
c("a","b","d"),
c("c","e"),
c("c","e"),
c("a","b","d","e")
)
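I get an error on the following attempt to convert the data. (Note that the call below misspells the class name as "transaction", so the message shown is about a missing coercion method; with the correct name "transactions", the duplicated "b" should trigger the same duplicated-items error as b_list in the arules version used here.)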
> a_trans2 <- as(a_list2,"transaction")
Error in as(a_list2, "transaction") :
no method or default for coercing “list” to “transaction”
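It turns out that b_list mentions "Steady Whitening Toothpast" twice in its fourth vector. Removing this duplicate by hand (b_list2 below is b_list without it) solves the problem.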
> b_trans2 <- as(b_list2,"transactions")
> summary(b_trans2)
transactions as itemMatrix in sparse format with
6 rows (elements/itemsets/transactions) and
... and so on ...
2 Best Choice Mini Donuts
3 Better Beef Soup
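The difference between duplicated transactions and duplicated items inside one transaction can be checked with base R alone. The following is a minimal sketch on made-up baskets (the `baskets` list and the other variable names are illustrative, not from the question): `duplicated()` applied to the list compares whole vectors, which is why it returned all FALSE for b_list, while `anyDuplicated()` applied to each vector looks inside it.

```r
# Made-up baskets: the second one repeats "milk",
# and the third is a copy of the first.
baskets <- list(
  c("bread", "milk"),
  c("milk", "eggs", "milk"),
  c("bread", "milk")
)

# duplicated() on a list compares whole elements: it flags the repeated
# basket and says nothing about the repeated item inside the second one.
dup_baskets <- duplicated(baskets)

# anyDuplicated() on a vector returns the index of the first repeated
# value, or 0 if there is none; > 0 marks baskets with a repeated item.
dup_inside <- vapply(baskets, anyDuplicated, integer(1)) > 0

print(dup_baskets)
print(which(dup_inside))
```

Running `which(vapply(b_list, anyDuplicated, integer(1)) > 0)` on the list from the question should point at the fourth vector.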
As for a solution for processing the real data, the following code runs without any errors.
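# Group product names by transaction ID
aggrData <- split(selData$product_name,selData$transaction_id)
listData <- list()
# Drop repeated products within each transaction and store them as character vectors
for (i in seq_along(aggrData)) {
  listData[[i]] <- as.character(aggrData[[i]][!duplicated(aggrData[[i]])])
}
# Coerce the de-duplicated list to an arules transactions object
trnsData <- as(listData,"transactions")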
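The same de-duplication can probably be written more compactly with `lapply()` and `unique()`. This is only a sketch: the small `selData` data frame below is a hypothetical stand-in with the two columns used above, not the Foodmart data.

```r
# Hypothetical stand-in for selData (made-up rows, factor columns).
selData <- data.frame(
  transaction_id = c(1, 1, 1, 2, 2),
  product_name = c("Tofu", "Salt", "Tofu", "Salt", "Eggs"),
  stringsAsFactors = TRUE
)

# Group product names (as character) by transaction ID.
aggrData <- split(as.character(selData$product_name), selData$transaction_id)

# Keep each product once per transaction.
listData <- lapply(aggrData, unique)

print(listData)
```

Unlike the loop, `lapply()` keeps the transaction IDs as list names; call `unname(listData)` if a plain unnamed list is wanted. The result can then be passed to `as(listData, "transactions")` as before.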
However, neither the following call nor attempts with other parameters produce any rules.
> rules <- apriori(trnsData)
parameter specification:
... and so on ...
writing ... [0 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
That, however, is a completely different story.
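One possible first check for the empty result, offered only as a sketch: `apriori()` defaults to a minimum support of 0.1 and a minimum confidence of 0.8, and in sparse retail data it is common that no item appears in 10% of the transactions. The base R snippet below computes single-item support on made-up baskets (the `baskets` list and variable names are illustrative); whether this is what happens with the Foodmart subset is an assumption, not something the output above confirms.

```r
# Made-up baskets for illustration.
baskets <- list(
  c("Tofu", "Salt"),
  c("Eggs", "Salt"),
  c("Beer"),
  c("Tofu", "Eggs")
)

# Relative support of each item: the share of baskets that contain it.
# unique() guards against an item being counted twice in one basket.
item_support <- table(unlist(lapply(baskets, unique))) / length(baskets)

# No itemset can have higher support than its most frequent single item,
# so this value is an upper bound on the support of any rule.
max_support <- max(item_support)

print(sort(item_support, decreasing = TRUE))
```

If the largest value is below the support threshold, no rule can pass it. In that case a much lower threshold, for example `apriori(trnsData, parameter = list(support = 0.001, confidence = 0.5))`, might be worth trying; the specific numbers here are arbitrary.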