First, we have some transaction data; we can use a built-in dataset.
require(arules)
## Can use built-in example dataset
require(datasets)
data(Groceries)
groceries <- as(Groceries, "transactions") # coerce to 'transactions' class (Groceries already is one)
summary(groceries)
The output is:
most frequent items:
      whole milk other vegetables       rolls/buns             soda           yogurt          (Other)
            2513             1903             1809             1715             1372            34055
But we also have a second data table whose values we want to use as labels:
itemnum <- c(1,2,3,4,5)
ProductName_ <- factor(c("whole milk", "other vegetables", "rolls/buns", "soda", "yogurt"))
ProductNames <- data.frame(itemnum, ProductName_)
How can I replace the product descriptions in the first table with the itemnum values from the second?
So that when I run:
summary(groceries)
the output is:
most frequent items:
      1       2       3       4       5 (Other)
   2513    1903    1809    1715    1372   34055
Answer 0 (score: 0)
You can change the item labels before calling summary:
library(arules)
library(datasets)
data(Groceries)
summary(Groceries)
# transactions as itemMatrix in sparse format with
# 9835 rows (elements/itemsets/transactions) and
# 169 columns (items) and a density of 0.02609146
#
# most frequent items:
#       whole milk other vegetables       rolls/buns             soda           yogurt
#             2513             1903             1809             1715             1372
#          (Other)
#            34055
#
# element (itemset/transaction) length distribution:
# sizes
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
# 2159 1643 1299 1005 855 645 545 438 350 246 182 117 78 77 55 46 29 14 14
# 20 21 22 23 24 26 27 28 29 32
# 9 11 4 6 1 1 1 1 3 1
#
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.000 2.000 3.000 4.409 6.000 32.000
#
# includes extended item information - examples:
#        labels  level2           level1
# 1 frankfurter sausage meet and sausage
# 2     sausage sausage meet and sausage
# 3  liver loaf sausage meet and sausage
itemnum <- c(1,2,3,4,5)
ProductName_ <- factor(c("whole milk", "other vegetables", "rolls/buns", "soda", "yogurt"))
ProductNames <- data.frame(itemnum, ProductName_)
# Replace the values in Groceries@itemInfo$labels with itemnum.
# Labels with no match in ProductNames become NA. See also plyr::mapvalues.
Groceries@itemInfo$labels <- ProductNames$itemnum[match(Groceries@itemInfo$labels,ProductNames$ProductName_)]
summary(Groceries)
# transactions as itemMatrix in sparse format with
# 9835 rows (elements/itemsets/transactions) and
# 169 columns (items) and a density of 0.02609146
#
# most frequent items:
#       1       2       3       4       5 (Other)
#    2513    1903    1809    1715    1372   34055
#
# element (itemset/transaction) length distribution:
# sizes
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
# 2159 1643 1299 1005 855 645 545 438 350 246 182 117 78 77 55 46 29 14 14
# 20 21 22 23 24 26 27 28 29 32
# 9 11 4 6 1 1 1 1 3 1
#
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.000 2.000 3.000 4.409 6.000 32.000
#
# includes extended item information - examples:
#   labels  level2           level1
# 1     NA sausage meet and sausage
# 2     NA sausage meet and sausage
# 3     NA sausage meet and sausage
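As the item information above shows, match() returns NA for every label that is missing from ProductNames, so all but the five listed items lose their labels. If that is not wanted, one possible variation is to fall back to the original label when there is no match. The following is only a sketch in base R: relabel_with_fallback is a hypothetical helper, not part of arules, and the short labels vector stands in for Groceries@itemInfo$labels.

```r
# Lookup table from the question.
itemnum <- c(1, 2, 3, 4, 5)
ProductName_ <- factor(c("whole milk", "other vegetables", "rolls/buns", "soda", "yogurt"))
ProductNames <- data.frame(itemnum, ProductName_)

# Hypothetical helper (an assumption, not an arules function):
# map labels through a lookup table and keep the original label
# wherever the lookup has no entry.
relabel_with_fallback <- function(labels, from, to) {
  labels <- as.character(labels)
  idx <- match(labels, as.character(from))  # NA where a label is not in `from`
  ifelse(is.na(idx), labels, as.character(to[idx]))
}

# Small stand-in for Groceries@itemInfo$labels.
labels <- c("frankfurter", "whole milk", "sausage", "soda")
new_labels <- relabel_with_fallback(labels,
                                    ProductNames$ProductName_,
                                    ProductNames$itemnum)
print(new_labels)  # values: "frankfurter", "1", "sausage", "4"
```

Note that the result is a character vector, so the item numbers appear as "1", "2", and so on. Assigning the result to Groceries@itemInfo$labels, in the same way as the answer does, should leave the remaining 164 labels unchanged instead of turning them into NA.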