Multiple criteria on the lhs for R apriori

Date: 2017-10-12 09:16:04

Tags: r apriori

I am trying to get the R apriori algorithm to let me require several attributes on the lhs at the same time.

rules <- apriori(Data, parameter=list(supp = 0.0001, conf = 0.001, minlen = 2), appearance = list(lhs = c("DiagnoseTekst=Acuut hartfalen"), default="rhs"))

The line above restricts the lhs to one specific DiagnoseTekst (the columns in my data are 'MedicatieTekst', 'Geslacht' and 'DiagnoseTekst'). However, I would like to filter on both DiagnoseTekst and Geslacht. When I put in

rules <- apriori(Data, parameter=list(supp = 0.0001, conf = 0.001, minlen = 2), appearance = list(lhs = c("DiagnoseTekst=Acuut hartfalen", "Geslacht=M"), default="rhs"))

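I get some rules that use only DiagnoseTekst and some that use only Geslacht. (In this case, most rules should have both attributes.) Is there a way to filter the search or the results so that I can specify multiple conditions for the lhs?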

The full code, for clarity:

library(arules)

# `data` is my data frame with the three columns listed above
Data <- as(data, "transactions")

str(Data)
rules <- apriori(Data, parameter=list(supp = 0.0001, conf = 0.001, minlen = 2), appearance = list(lhs = c("DiagnoseTekst=Acuut hartfalen"), default="rhs"))
top.conf <- sort(rules, decreasing = TRUE, na.last = NA, by = c("confidence","lift"))
set <- inspect(head(subset(top.conf), 30))

Example output:

> set <- inspect(head(subset(top.conf), 30))
     lhs                                rhs                                                support     confidence  lift      count
[1]  {DiagnoseTekst=Acuut hartfalen} => {Geslacht=V}                                       0.066477566 0.525500378 1.1539592 30561
[2]  {DiagnoseTekst=Acuut hartfalen} => {Geslacht=M}                                       0.060025798 0.474499622 0.8712635 27595
[3]  {DiagnoseTekst=Acuut hartfalen} => {MedicatieTekst=FUROSEMIDE}                        0.017917467 0.141636289 2.9290550  8237
[4]  {DiagnoseTekst=Acuut hartfalen} => {MedicatieTekst=METOPROLOL}                        0.006279923 0.049642341 0.9877311  2887
[5]  {DiagnoseTekst=Acuut hartfalen} => {MedicatieTekst=PARACETAMOL}                       0.005201003 0.041113557 0.6085413  2391

1 answer:

Answer 0 (score: 0)

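If I understand correctly, you want to make sure that both specified items must occur in the LHS of a rule. The appearance argument can only restrict which items may occur in a rule; it cannot require that all of them are present. However, you can work around this with subset filtering. The following code uses %ain% to find all rules that have both items "age=Senior" and "sex=Male" in the LHS (%ain% matches only when all of the listed items are present; see ? "%ain%"):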

> library("arules")
> data(Adult)

> rules <- apriori(Adult)
> rules
set of 6137 rules 

> rules <- subset(rules, lhs %ain% c("age=Senior","sex=Male"))
> rules
set of 167 rules 

> inspect(head(rules, by = "lift", n  = 3))
    lhs                                    rhs                    support confidence lift count
[1] {age=Senior,                                                                               
     marital-status=Married-civ-spouse,                                                        
     sex=Male,                                                                                 
     capital-gain=None,                                                                        
     native-country=United-States}      => {relationship=Husband}    0.12          1  2.5  5687
[2] {age=Senior,                                                                               
     marital-status=Married-civ-spouse,                                                        
     race=White,                                                                               
     sex=Male,                                                                                 
     capital-gain=None,                                                                        
     native-country=United-States}      => {relationship=Husband}    0.11          1  2.5  5293
[3] {age=Senior,                                                                               
     marital-status=Married-civ-spouse,                                                        
     sex=Male,                                                                                 
     capital-gain=None,                                                                        
     capital-loss=None,                                                                        
     native-country=United-States}      => {relationship=Husband}    0.11          1  2.5  5238
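
Applied to the question's setup, the same filter might look like the sketch below. Since the original data are not available, `toy` is a hypothetical data frame with made-up values; only the column names and the two item labels are taken from the question. The idea is to keep the appearance restriction as it was and then apply %ain% to the result.

```r
library(arules)

# Hypothetical toy data: column names follow the question, values are invented.
toy <- data.frame(
  DiagnoseTekst  = c("Acuut hartfalen", "Acuut hartfalen", "Acuut hartfalen",
                     "Astma", "Astma", "Acuut hartfalen"),
  Geslacht       = c("M", "M", "V", "M", "V", "M"),
  MedicatieTekst = c("FUROSEMIDE", "FUROSEMIDE", "METOPROLOL",
                     "PARACETAMOL", "PARACETAMOL", "METOPROLOL"),
  stringsAsFactors = TRUE
)
Data <- as(toy, "transactions")

# Same call as in the question: the two items may appear only on the lhs,
# everything else only on the rhs. This alone does not force both to be present.
rules <- apriori(
  Data,
  parameter  = list(supp = 0.0001, conf = 0.001, minlen = 2),
  appearance = list(lhs = c("DiagnoseTekst=Acuut hartfalen", "Geslacht=M"),
                    default = "rhs"),
  control    = list(verbose = FALSE)
)

# Keep only the rules whose lhs contains both items.
both <- subset(rules, lhs %ain% c("DiagnoseTekst=Acuut hartfalen", "Geslacht=M"))

inspect(sort(both, by = "confidence"))
```

Here `rules` still contains the rules with only one of the two items on the lhs, while `both` keeps just those where the two occur together. Using %in% instead of %ain% would match rules that contain any of the listed items, which is not what the question asks for.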