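My data mining problem is next web page prediction using existing web data. For this I have a set of frequent sequences obtained with the cspade algorithm in R. Now I am not sure how to mine association rules from them so that I can predict the next page. Can anybody help?
The frequent sequences are as follows: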
    items                                     support
 1  [{1}]                                     0.1640261
 2  [{10}]                                    0.05112657
 3  [{11}]                                    0.05818949
 4  [{12}]                                    0.11333700
 5  [{13}]                                    0.07773954
 6  [{14}]                                    0.12036354
 7  [{15}]                                    0.02950037
 8  [{17}]                                    0.01111922
 9  [{2}]                                     0.17708912
10  [{3}]                                     0.12320245
11  [{4}]                                     0.12297109
12  [{5}]                                     0.02524403
13  [{6}]                                     0.21933426
14  [{7}]                                     0.08134223
15  [{8}]                                     0.09659857
16  [{9}]                                     0.09111978
17  [{6}, {9}]                                0.01086563
18  [{9}, {9}]                                0.04410508
19  [{9}, {9}, {9}]                           0.02321639
20  [{9}, {9}, {9}, {9}]                      0.01316606
21  [{8}, {8}]                                0.06783368
22  [{8}, {8}, {8}]                           0.05253996
23  [{8}, {8}, {8}, {8}]                      0.04431926
24  [{8}, {8}, {8}, {8}, {8}]                 0.03771097
25  [{8}, {8}, {8}, {8}, {8}, {8}]            0.02928619
26  [{8}, {8}, {8}, {8}, {8}, {8}, {8}]       0.02185351
Answer 0 (score: 0)
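Unfortunately, that is not much data.
Forget the 1-itemsets; a sequence with a single page has nothing to predict from.
Choose the prediction rule with the highest confidence: match the first n-1 items of a frequent sequence, and predict its last item.
Effectively, there are three predictions in your data: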
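The "match n-1 items, predict the last" idea could be sketched roughly as follows. This is only an illustration in Python rather than R, not the cspade API: the names `supports`, `build_rules` and `predict_next` are hypothetical, only a subset of the sequences from the question is included, and confidence is assumed to be support(sequence) / support(prefix). It also assumes a rule fires when its prefix equals the most recently visited pages, which is stricter than cspade's gap-tolerant subsequence matching.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Pages = Tuple[int, ...]
Rule = Tuple[Pages, int, float]  # (prefix, predicted page, confidence)

# Subset of the frequent sequences from the question: page IDs -> support.
supports: Dict[Pages, float] = {
    (6,): 0.21933426,
    (8,): 0.09659857,
    (9,): 0.09111978,
    (6, 9): 0.01086563,
    (9, 9): 0.04410508,
    (9, 9, 9): 0.02321639,
    (8, 8): 0.06783368,
    (8, 8, 8): 0.05253996,
}


def build_rules(supports: Dict[Pages, float]) -> List[Rule]:
    """Turn each sequence of length >= 2 into a rule: first n-1 items -> last item."""
    rules: List[Rule] = []
    for seq, sup in supports.items():
        if len(seq) < 2:
            continue  # 1-item sequences cannot predict anything
        prefix = seq[:-1]
        if prefix not in supports:
            continue  # confidence cannot be computed without the prefix support
        rules.append((prefix, seq[-1], sup / supports[prefix]))
    return rules


def predict_next(history: Sequence[int], rules: List[Rule]) -> Optional[Tuple[int, float]]:
    """Return (page, confidence) of the most confident matching rule, or None."""
    best: Optional[Tuple[int, float]] = None
    for prefix, nxt, conf in rules:
        # Assumption: the prefix must equal the tail of the browsing history.
        if len(prefix) <= len(history) and tuple(history[-len(prefix):]) == prefix:
            if best is None or conf > best[1]:
                best = (nxt, conf)
    return best


rules = build_rules(supports)
print(predict_next([8, 8], rules))  # most confident rule whose prefix ends the history
print(predict_next([1], rules))     # no rule starts from page 1, so nothing is predicted
```

With so few multi-item sequences, most histories will match no rule at all, so a fallback (for example, the most frequent single page) may be needed in practice.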