Converting an R data.frame column to arules transactions

Time: 2017-06-18 19:55:01

Tags: r transactions data-conversion apriori arules

Overview:

I need to convert the following data.frame column (t$Tags) into arules transactions:

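  1. ios, button, swift3, compiler-errors, null
  2. c#, pass-by-reference, unsafe-pointers
  3. spring, maven, spring-mvc, spring-security, spring-java-config
  4. android, android-fragments, android-fragmentmanager
  5. scala, scala-collections
  6. python-2.7, python-3.x, matplotlib, plot

Since this data is already in basket format, I followed example 3 in the arules documentation (https://cran.r-project.org/web/packages/arules/arules.pdf, page 90) and converted the column as follows: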

    ######################################################################################################
    #Option 1 - converting data.frame as described in the documentation (page 90)
    ######################################################################################################
    library(arules)
    ## example 3: creating transactions from data.frame
    a_df <- data.frame(
      Tags = as.factor(c("scala",
                          "ios, button, swift3, compiler-errors, null",
                          "c#, pass-by-reference, unsafe-pointers",
                          "spring, maven, spring-mvc, spring-security, spring-java-config",
                          "android, android-fragments, android-fragmentmanager",
                          "scala, scala-collections",
                          "python-2.7, python-3.x, matplotlib, plot"))
      )
    ## coerce
    trans3 <- as(a_df, "transactions")
    rules <- apriori(trans3, parameter = list(sup = 0.1, conf = 0.5, target="rules",minlen=1))
    rules_output <- as(rules,"data.frame")
    ## Result: 0 rules
    ######################################################################################################
    # Option 2 - reading from a CSV file, which contains exactly the same data
    # above without the header and the quotes
    ######################################################################################################
    file = "Test.csv"
    trans3 = read.transactions(file = file, sep = ",", format = c("basket"))
    rules <- apriori(trans3, parameter = list(sup = 0.1, conf = 0.5, target="rules",minlen=1))
    rules_output <- as(rules,"data.frame")
    ## Result: 198 rules
    

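Option 1 - result = 0 rules

Option 2 - result = 198 rules

Question:

In my current task and environment, I cannot save the data.frame column to a formatted flat file (CSV or any other) and then read it back with read.transactions (that is, turn Option 1 into Option 2). How can I convert the data.frame column into the correct format so that the arules apriori algorithm works properly?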

1 Answer:

Answer 0 (score: 2)

Have a look at the examples in ? transactions. You need a list containing vectors of items (item labels), not a data.frame.
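
A minimal sketch of that idea, assuming the tags in each row are separated by a comma followed by optional whitespace (as in the sample data from the question). Coercing the data.frame directly makes each whole string a single item of the form Tags=..., so every transaction holds exactly one item. Splitting the strings first gives arules one character vector of item labels per transaction. The name items_list is only illustrative, and this is not necessarily the exact code the answer originally had in mind.

    library(arules)

    # Sample data copied from the question
    a_df <- data.frame(
      Tags = as.factor(c("scala",
                         "ios, button, swift3, compiler-errors, null",
                         "c#, pass-by-reference, unsafe-pointers",
                         "spring, maven, spring-mvc, spring-security, spring-java-config",
                         "android, android-fragments, android-fragmentmanager",
                         "scala, scala-collections",
                         "python-2.7, python-3.x, matplotlib, plot"))
    )

    # Split each row into a character vector of item labels
    # (assumption: items are separated by "," plus optional whitespace)
    items_list <- strsplit(as.character(a_df$Tags), ",\\s*")

    # Coerce the list (one vector of items per transaction) to transactions
    trans3 <- as(items_list, "transactions")

    # Each transaction now holds several items instead of one long string
    size(trans3)

    rules <- apriori(trans3,
                     parameter = list(support = 0.1, confidence = 0.5,
                                      target = "rules", minlen = 1))
    rules_output <- as(rules, "data.frame")

Running inspect(trans3) before calling apriori is a quick way to confirm that the items were split as intended. If a row could contain the same tag twice, it is probably safer to wrap each vector in unique() before the coercion.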