I have a sample set of transactions stored in a table named tr_table, shown below:
+---------+-----------+
| tr_kode | item      |
+---------+-----------+
| T1      | 1         |
| T1      | 2         |
| T1      | 2         |
| T1      | 5         |
| T2      | 1         |
| T2      | 3         |
| T2      | 4         |
| T2      | 5         |
| T2      | 6         |
| T3      | 1         |
| T3      | 2         |
| T4      | 4         |
| T4      | 2         |
| T4      | 6         |
| T5      | 6         |
| T5      | 5         |
| T5      | 4         |
| T6      | 3         |
| T6      | 6         |
| T6      | 2         |
| T7      | 2         |
| T7      | 1         |
| T7      | 7         |
+---------+-----------+
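Then I set the minimum support to 20% and built a view named freq_item that holds the frequent items. The view contains only the items that meet the minimum support, sorted by suppCount in descending order: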
+------+-----------+
| item | suppCount |
+------+-----------+
| 2    | 6         |
| 1    | 4         |
| 6    | 4         |
| 4    | 3         |
| 5    | 3         |
| 3    | 2         |
+------+-----------+
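To make the step concrete, here is a minimal Python sketch of how the freq_item counts could be reproduced from tr_table. It assumes that suppCount counts rows (so the duplicated T1/2 row is counted twice, which is what gives item 2 a count of 6) and that ties are broken by item value. The names `tr_table`, `MIN_SUPPORT` and `freq_item` are only illustrative Python variables, not the actual view definition.

```python
from collections import Counter

# Rows of tr_table as (tr_kode, item), copied from the table above.
tr_table = [
    ("T1", 1), ("T1", 2), ("T1", 2), ("T1", 5),
    ("T2", 1), ("T2", 3), ("T2", 4), ("T2", 5), ("T2", 6),
    ("T3", 1), ("T3", 2),
    ("T4", 4), ("T4", 2), ("T4", 6),
    ("T5", 6), ("T5", 5), ("T5", 4),
    ("T6", 3), ("T6", 6), ("T6", 2),
    ("T7", 2), ("T7", 1), ("T7", 7),
]

MIN_SUPPORT = 0.20  # 20% minimum support

# Number of distinct transactions (T1..T7) and the resulting count threshold.
n_transactions = len({tid for tid, _ in tr_table})
min_count = MIN_SUPPORT * n_transactions

# Assumption: support is counted per row, so the repeated ("T1", 2) counts twice.
supp_count = Counter(item for _, item in tr_table)

# Keep items meeting the threshold, highest count first, ties by item value.
freq_item = sorted(
    ((item, count) for item, count in supp_count.items() if count >= min_count),
    key=lambda pair: (-pair[1], pair[0]),
)

for item, count in freq_item:
    print(item, count)
```

Item 7 appears only once, which is below the threshold, so it drops out. Counting distinct transactions instead of rows would give item 2 a support of 5, but the ordering of the items would stay the same.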
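After that, I obtained a table of transactions containing only the frequent items, with the items of each transaction sorted in freq_item order. This table is called selected_tr: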
+------+------+
| tid  | item |
+------+------+
| T1   | 2    |
| T1   | 1    |
| T1   | 5    |
| T2   | 1    |
| T2   | 6    |
| T2   | 4    |
| T2   | 5    |
| T2   | 3    |
| T3   | 2    |
| T3   | 1    |
| T4   | 2    |
| T4   | 6    |
| T4   | 4    |
| T5   | 6    |
| T5   | 4    |
| T5   | 5    |
| T6   | 2    |
| T6   | 6    |
| T6   | 3    |
| T7   | 2    |
| T7   | 1    |
+------+------+
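As an illustration of how selected_tr relates to the other two tables, the following Python sketch filters each transaction down to its frequent items, drops duplicates, and orders the items by their position in freq_item. The variable names (`item_order`, `rank`, `transactions`) are hypothetical and the sketch is not meant to reflect how the table was actually produced.

```python
# Frequent items in freq_item order (highest suppCount first).
item_order = [2, 1, 6, 4, 5, 3]
rank = {item: position for position, item in enumerate(item_order)}

# Items of each transaction, copied from tr_table.
transactions = {
    "T1": [1, 2, 2, 5],
    "T2": [1, 3, 4, 5, 6],
    "T3": [1, 2],
    "T4": [4, 2, 6],
    "T5": [6, 5, 4],
    "T6": [3, 6, 2],
    "T7": [2, 1, 7],
}

selected_tr = []
for tid, items in transactions.items():
    # Drop infrequent items and duplicates, then sort by freq_item position.
    kept = sorted({item for item in items if item in rank}, key=rank.get)
    selected_tr.extend((tid, item) for item in kept)

for tid, item in selected_tr:
    print(tid, item)
```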
My question is: how can I build an FP-tree from selected_tr, and then find the frequent patterns using the FP-growth algorithm? Thank you.
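To make clear what kind of result is being asked for, here is a rough Python sketch of one textbook-style way to build an FP-tree and mine it recursively with FP-growth. It is a sketch under assumptions, not a definitive implementation: the `Node` class and the `build_tree` and `fp_growth` functions are hypothetical names, the 20% threshold over 7 transactions (1.4) is rounded up to a minimum count of 2, and support is counted per transaction, so item 2 has support 5 here rather than the 6 shown in freq_item.

```python
from collections import defaultdict

class Node:
    """One FP-tree node: an item, a count, a parent link and child nodes."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(transactions, min_count):
    """Build an FP-tree from a list of (items, count) pairs.

    Returns (root, header, support); header maps each frequent item
    to the list of tree nodes that carry it.
    """
    support = defaultdict(int)
    for items, count in transactions:
        for item in items:
            support[item] += count
    frequent = {item for item, c in support.items() if c >= min_count}
    root, header = Node(None, None), defaultdict(list)
    for items, count in transactions:
        # Keep frequent items, most frequent first (ties broken by item value).
        path = sorted((i for i in items if i in frequent),
                      key=lambda i: (-support[i], i))
        node = root
        for item in path:
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
                header[item].append(child)
            child.count += count
            node = child
    return root, header, support

def fp_growth(transactions, min_count, suffix=()):
    """Yield (pattern, support) for every frequent pattern."""
    _, header, support = build_tree(transactions, min_count)
    for item, nodes in header.items():
        pattern = (item,) + suffix
        yield tuple(sorted(pattern)), support[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        base = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        # Mine the conditional tree built from that base.
        yield from fp_growth(base, min_count, pattern)

# Rows of selected_tr as (tid, item).
selected_tr = [
    ("T1", 2), ("T1", 1), ("T1", 5),
    ("T2", 1), ("T2", 6), ("T2", 4), ("T2", 5), ("T2", 3),
    ("T3", 2), ("T3", 1),
    ("T4", 2), ("T4", 6), ("T4", 4),
    ("T5", 6), ("T5", 4), ("T5", 5),
    ("T6", 2), ("T6", 6), ("T6", 3),
    ("T7", 2), ("T7", 1),
]

# Group rows into one item list per transaction, each with weight 1.
by_tid = defaultdict(list)
for tid, item in selected_tr:
    by_tid[tid].append(item)
transactions = [(items, 1) for items in by_tid.values()]

patterns = dict(fp_growth(transactions, min_count=2))
for pattern, count in sorted(patterns.items(), key=lambda p: (len(p[0]), p[0])):
    print(pattern, count)
```

With this data the sketch should report the six single items, seven pairs such as (1, 2) and (4, 6) with support 3, and one triple, (4, 5, 6), with support 2. Rebuilding a small tree for each conditional pattern base keeps the code short; sharing structure between trees would be faster but longer.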