How can I use Python apriori on the House Votes binary dataset?

Asked: 2019-04-23 15:32:12

Tags: python apriori

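I want to analyze the House Votes 84 dataset with apriori. It has 17 columns: the first column is "party", which takes one of two class values, and the remaining columns are binary. How can I apply apriori to this data in Python? minsup = 0.3, minconfidence = 0.9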

Here is my code. The output looks ugly and does not make sense.

import matplotlib.pyplot as plt
from sklearn import datasets
import pandas as pd
import numpy as np
import sys
import os  

from apyori import apriori 
from mlxtend.frequent_patterns import apriori
from efficient_apriori import apriori
from mlxtend.frequent_patterns import association_rules
from mlxtend.preprocessing import TransactionEncoder

df = pd.read_table("house-votes-84.data", sep=",", header=None, 
na_values="?")
col_names = ['party', 'infants', 'water', 'budget', 'physician', 
'salvador','religious', 'satellite', 'aid', 'missile', 'immigration', 
'synfuels','education', 'superfund', 'crime', 'duty_free_exports', 
'eaa_rsa']
df = df.fillna(0)
df.columns = col_names
df.shape
print(df.head())

df = df.replace({'y': 1, 'n': -1, '?': 0})
print(df.head()) 

records = []
for i in range(0, 435):
    records.append([str(df.values[i, j]) for j in range(0, 16)])

association_rules = apriori(records, min_support=0.3, min_confidence=0.9)
association_results = list(association_rules)
print(len(association_rules))
print(association_rules[0])

Output:

{1: {('-1',): 433, ('0',): 154, ('1',): 434, ('democrat',): 267, ('republican',): 168}, 2: {('-1', '0'): 152, ('-1', '1'): 433, ('-1', 'democrat'): 266, ('-1', 'republican'): 167, ('0', '1'): 153, ('1', 'democrat'): 267, ('1', 'republican'): 167}, 3: {('-1', '0', '1'): 152, ('-1', '1', 'democrat'): 266, ('-1', '1', 'republican'): 167}}

1 answer:

Answer 0 (score: 0)

The apriori function in efficient_apriori returns a tuple (itemsets, rules). To use efficient_apriori, you can do the following:

from efficient_apriori import apriori
itemsets, rules = apriori(records, min_support=0.3, min_confidence=0.9)
for rule in rules:
    print(rule)

For more information, see this example.
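A likely reason the itemsets in the question look meaningless is that every vote is stored as a bare '1', '-1' or '0', so all 16 vote columns collapse into the same three items. The loop also uses range(0, 16), which covers only 16 of the 17 columns. Below is a minimal sketch of one way to label each vote with its column name before calling apriori. The helpers rows_to_transactions and support are hypothetical names made up for this illustration, the sample rows are invented, and dropping "?" votes rather than coding them as 0 is an assumption, not something the dataset requires.

```python
# Hypothetical helper (not part of efficient_apriori): turn each row into a
# transaction of "column=value" items so the same vote on different bills
# stays a distinct item.
def rows_to_transactions(rows, col_names, missing="?"):
    transactions = []
    for row in rows:
        items = tuple(
            f"{name}={value}"
            for name, value in zip(col_names, row)
            if value != missing  # assumption: skip unknown votes entirely
        )
        transactions.append(items)
    return transactions


# Hypothetical helper: fraction of transactions that contain every item
# in `itemset`. This is the quantity min_support is compared against.
def support(transactions, itemset):
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


# Invented sample in the layout of house-votes-84.data (first 3 columns only).
col_names = ["party", "infants", "water"]
rows = [
    ["republican", "n", "y"],
    ["democrat", "y", "y"],
    ["democrat", "?", "n"],
    ["democrat", "y", "n"],
]

transactions = rows_to_transactions(rows, col_names)
print(transactions[0])
print(support(transactions, ["party=democrat", "infants=y"]))
```

The resulting transactions can be passed to apriori(transactions, min_support=0.3, min_confidence=0.9) exactly as in the answer above, and the rules then name the bill and the vote on each side.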