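arules needs a list of transactions. Each row of the list contains a set of products, and not every transaction has the same number of products. It sounds like a pivot, but it isn't.
An example can be found here.
I tried something like the following, which fails with the error shown below it: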
aggregate(dvd , by=list("ID"), FUN=c)
arguments must have same length
Here is my data:
> dvd
ID Item
1 1 Sixth Sense
2 1 LOTR1
3 1 Harry Potter1
4 1 Green Mile
5 1 LOTR2
6 2 Gladiator
7 2 Patriot
8 2 Braveheart
9 3 LOTR1
10 3 LOTR2
11 4 Gladiator
12 4 Patriot
13 4 Sixth Sense
14 5 Gladiator
15 5 Patriot
16 5 Sixth Sense
17 6 Gladiator
18 6 Patriot
19 6 Sixth Sense
20 7 Harry Potter1
21 7 Harry Potter2
22 8 Gladiator
23 8 Patriot
24 9 Gladiator
25 9 Patriot
26 9 Sixth Sense
27 10 Sixth Sense
28 10 LOTR
29 10 Galdiator
30 10 Green Mile
I need a list that looks like this:
TR1 c("Sixth Sense","LOTR1","Harry Potter1","Green Mile","LOTR2")
TR2 c("Gladiator","Patriot","Braveheart")
TR3 c("LOTR1","LOTR2")
....
Answer 0 (score: 2)
Your aggregate command can work, but the arguments are not specified correctly. You need something like this:
with(DF, aggregate(Item, list(ID), FUN = function(x) c(as.character(x))))
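To see why the call in the question fails: by = list("ID") passes a one-element character vector rather than the ID column, so its length does not match the number of rows. The following is only a minimal sketch on a made-up data frame (toy is hypothetical, not the asker's data):

```r
# Hypothetical toy data in the same long format as dvd
toy <- data.frame(ID = c(1, 1, 2),
                  Item = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# by = list("ID") is a length-1 character vector, not the ID column,
# so aggregate() stops with "arguments must have same length"
bad <- tryCatch(aggregate(toy, by = list("ID"), FUN = c),
                error = function(e) conditionMessage(e))

# Passing the column itself groups the items per ID
good <- aggregate(toy["Item"], by = list(ID = toy$ID), FUN = c)
```

Here bad holds the error message and good holds one row per ID, with the items collected in a list column.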
Alternatively, you can use the formula method of aggregate:
aggregate(Item ~ ID, DF, c)
# ID Item
# 1 1 Sixth Sense, LOTR1, Harry Potter1, Green Mile, LOTR2
# 2 10 Sixth Sense, LOTR, Galdiator, Green Mile
# 3 2 Gladiator, Patriot, Braveheart
# 4 3 LOTR1, LOTR2
# 5 4 Gladiator, Patriot, Sixth Sense
# 6 5 Gladiator, Patriot, Sixth Sense
# 7 6 Gladiator, Patriot, Sixth Sense
# 8 7 Harry Potter1, Harry Potter2
# 9 8 Gladiator, Patriot
# 10 9 Gladiator, Patriot, Sixth Sense
str(.Last.value)
# 'data.frame': 10 obs. of 2 variables:
# $ ID : chr "1" "10" "2" "3" ...
# $ Item:List of 10
# ..$ 1 : chr "Sixth Sense" "LOTR1" "Harry Potter1" "Green Mile" ...
# ..$ 6 : chr "Sixth Sense" "LOTR" "Galdiator" "Green Mile"
# ..$ 10: chr "Gladiator" "Patriot" "Braveheart"
# ..$ 13: chr "LOTR1" "LOTR2"
# ..$ 15: chr "Gladiator" "Patriot" "Sixth Sense"
# ..$ 18: chr "Gladiator" "Patriot" "Sixth Sense"
# ..$ 21: chr "Gladiator" "Patriot" "Sixth Sense"
# ..$ 24: chr "Harry Potter1" "Harry Potter2"
# ..$ 26: chr "Gladiator" "Patriot"
# ..$ 28: chr "Gladiator" "Patriot" "Sixth Sense"
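If a plain named list (as asked for in the question) is wanted rather than a data frame with a list column, the Item column can be pulled out and named. This is a small sketch under assumptions: toy is a hypothetical data frame, and the "TR" prefix is taken from the question. simplify = FALSE is used so that the column stays a list even when every group happens to have the same size.

```r
# Hypothetical toy data in the same long format as DF
toy <- data.frame(ID = c(1, 1, 2),
                  Item = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# One row per ID; Item becomes a list column
agg <- aggregate(Item ~ ID, toy, c, simplify = FALSE)

# Extract the list column and name each element TR<ID>
tr_list <- setNames(agg$Item, paste0("TR", agg$ID))
```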
Alternatively, you can use the "data.table" package:
library(data.table)
as.data.table(DF)[, list(list(Item)), by = ID]
# ID V1
# 1: 1 Sixth Sense,LOTR1,Harry Potter1,Green Mile,LOTR2
# 2: 2 Gladiator,Patriot,Braveheart
# 3: 3 LOTR1,LOTR2
# 4: 4 Gladiator,Patriot,Sixth Sense
# 5: 5 Gladiator,Patriot,Sixth Sense
# 6: 6 Gladiator,Patriot,Sixth Sense
# 7: 7 Harry Potter1,Harry Potter2
# 8: 8 Gladiator,Patriot
# 9: 9 Gladiator,Patriot,Sixth Sense
# 10: 10 Sixth Sense,LOTR,Galdiator,Green Mile
Answer 1 (score: 2)
arules' read.transactions has a format argument that can solve your problem. Here is the usage:
read.transactions(file, format = c("basket", "single"), sep = NULL,
cols = NULL, rm.duplicates = FALSE, encoding = "unknown")
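See the format argument? You can use "basket" or "single" to indicate the format of the input data. You are trying to convert your data to "basket" format, but the data you have is already in "single" format: each row contains a single item with its ID. Just use read.transactions with format set to "single" (and cols pointing at the ID and item columns), and you're golden.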
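As a rough illustration of the "single" format, here is a sketch that writes a small made-up data frame to a temporary file and reads it back. The toy data, the comma separator, and the headerless file are assumptions made for this example; cols = c(1, 2) gives the positions of the transaction-ID and item columns.

```r
library(arules)

# Hypothetical toy data in "single" format: one (ID, item) pair per row
toy <- data.frame(ID = c(1, 1, 2), Item = c("A", "B", "C"))

# Write it to a temporary CSV file without a header row
tmp <- tempfile(fileext = ".csv")
write.table(toy, tmp, sep = ",", row.names = FALSE,
            col.names = FALSE, quote = FALSE)

# cols gives the positions of the transaction-ID and item columns
trans <- read.transactions(tmp, format = "single", sep = ",", cols = c(1, 2))
inspect(trans)
```

This only helps when the data lives in a file; for a data frame already in memory, writing it out first is an extra step.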
Answer 2 (score: 1)
I think split will do the job for you.
DF <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 9L,
10L, 10L, 10L, 10L), Item = c(" Sixth Sense", " LOTR1",
" Harry Potter1", " Green Mile", " LOTR2", " Gladiator",
" Patriot", " Braveheart", " LOTR1", " LOTR2",
" Gladiator", " Patriot", " Sixth Sense", " Gladiator",
" Patriot", " Sixth Sense", " Gladiator", " Patriot",
" Sixth Sense", " Harry Potter1", " Harry Potter2", " Gladiator",
" Patriot", " Gladiator", " Patriot", " Sixth Sense",
" Sixth Sense", " LOTR", " Galdiator", " Green Mile"
)), .Names = c("ID", "Item"), class = "data.frame", row.names = c(NA,
-30L))
# The Item strings above carry leading spaces, so trim them first
DF$Item <- trimws(DF$Item)
# Collect the items of each transaction into one list element per ID
result <- split(DF$Item, DF$ID)
# Prefix the list names with "TR"
names(result) <- paste0("TR", names(result))
result
## $TR1
## [1] "Sixth Sense" "LOTR1" "Harry Potter1" "Green Mile" "LOTR2"
##
## $TR2
## [1] "Gladiator" "Patriot" "Braveheart"
##
## $TR3
## [1] "LOTR1" "LOTR2"
##
## $TR4
## [1] "Gladiator" "Patriot" "Sixth Sense"
##
## $TR5
## [1] "Gladiator" "Patriot" "Sixth Sense"
##
## $TR6
## [1] "Gladiator" "Patriot" "Sixth Sense"
##
## $TR7
## [1] "Harry Potter1" "Harry Potter2"
##
## $TR8
## [1] "Gladiator" "Patriot"
##
## $TR9
## [1] "Gladiator" "Patriot" "Sixth Sense"
##
## $TR10
## [1] "Sixth Sense" "LOTR" "Galdiator" "Green Mile"
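A list like this can then be handed to arules by coercing it to a transactions object. The sketch below uses a small made-up list (toy_list is hypothetical) in place of result:

```r
library(arules)

# Hypothetical list with one character vector of items per transaction
toy_list <- list(TR1 = c("A", "B"), TR2 = "C")

# arules can coerce a list of item vectors to a transactions object
trans <- as(toy_list, "transactions")
summary(trans)
```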