Prediction using the ID3 algorithm with the Accord.NET framework

Time: 2016-06-12 05:09:13

Tags: c# machine-learning classification prediction id3

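I am using the following code to find the expected sales of a given item when its price, discount, and advertisement are changed. It is implemented with the ID3 algorithm using the Accord.Net library.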

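using System;
using System.Data;
using Accord.MachineLearning.DecisionTrees;
using Accord.MachineLearning.DecisionTrees.Learning;
using Accord.Math;
using Accord.Statistics.Filters;

namespace PnredictionSales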
{
public partial class WebForm1 : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        DataTable data = new DataTable("Sales prediction Example");

        data.Columns.Add("RowKey");
        data.Columns.Add("Brand");
        data.Columns.Add("PriceRange");
        data.Columns.Add("Discount");
        data.Columns.Add("Advertisement");
        data.Columns.Add("ExpSales");

        //  data.Columns.Add("Wind");
        //  data.Columns.Add("PlayTennis");

        data.Rows.Add("D1", "Highland", "R1", "yes", "No", "B");
        data.Rows.Add("D2", "Highland", "R1", "yes", "yes", "C");
        data.Rows.Add("D3", "Anchor", "R1", "yes", "No", "B");
        data.Rows.Add("D4", "Flora", "R2", "yes", "No", "B");
        data.Rows.Add("D5", "Flora", "R3", "No", "No", "A");
        data.Rows.Add("D6", "Flora", "R3", "No", "yes", "A");
        data.Rows.Add("D7", "Anchor", "R3", "No", "yes", "A");
        data.Rows.Add("D8", "Highland", "R2", "yes", "No", "B");
        data.Rows.Add("D9", "Highland", "R3", "No", "No", "A");
        data.Rows.Add("D10", "Flora", "R2", "No", "No", "B");
        data.Rows.Add("D11", "Highland", "R2", "No", "yes", "B");
        data.Rows.Add("D12", "Anchor", "R2", "yes", "yes", "A");
        data.Rows.Add("D13", "Anchor", "R1", "No", "No", "B");
        data.Rows.Add("D14", "Flora", "R2", "yes", "yes", "A");

        Codification codebook = new Codification(data);

        DecisionVariable[] attributes =
        {
            new DecisionVariable("Brand", 3),  new DecisionVariable("PriceRange",3),
            new DecisionVariable("Discount",2),new DecisionVariable("Advertisement",2) 
        };

        int classCount = 3; // 3 possible output values for ExpSales: A, B or C

        DecisionTree tree = new DecisionTree(attributes, classCount);

        // Create a new instance of the ID3 algorithm
        ID3Learning id3learning = new ID3Learning(tree);

        // Translate our training data into integer symbols using our codebook:
        DataTable symbols = codebook.Apply(data);
        int[][] inputs = symbols.ToIntArray("Brand", "PriceRange","Discount","Advertisement");
        int[] outputs = symbols.ToIntArray("ExpSales").GetColumn(0);

        // Learn the training instances!

        id3learning.Run(inputs, outputs);
        int[] query = codebook.Translate("Flora","R1","yes","No");

        int output = tree.Compute(query.ToDouble());

        string answer = codebook.Translate("ExpSales", output); // answer will be one of "A", "B" or "C"
        Label1.Text = answer;
    }
}
}

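My question is:

When I pass arbitrary string values, as in int[] query = codebook.Translate("fff","eee","ffg","qqq");, it still gives me an output. What could be the reason? Is my approach wrong? I would also like to know the minimum requirements for organizing the data in the data table in order to get accurate results.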

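1 answer:

Answer 0 (score: 2)

I tried running your code, but I got an exception rather than any output. Looking at it, I think the problem is that when you create the Codification you do not specify which columns to include, so it also includes the RowKey column, and then the values no longer line up with the columns. Instead, create the codification with: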

Codification codebook = new Codification(data, "Brand", "PriceRange", "Discount", "Advertisement", "ExpSales");

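Then it seems to work.

When I then tried your example of int[] query = codebook.Translate("fff","eee","ffg","qqq"); again, I just got an exception (because those values do not exist in the data), so I think you must have an exception handler somewhere that is hiding these problems.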
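If you would rather reject unknown values yourself instead of relying on the exception, one option is to check each query value against the values that actually occur in the training table before calling Translate. The following is only a minimal sketch using plain System.Data and LINQ; FindUnknownValues is a hypothetical helper name, not part of Accord.Net, and the two-column table is a cut-down stand-in for the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper (not an Accord.Net API): returns the names of the
// columns whose query value never occurs in the training table.
List<string> FindUnknownValues(DataTable data, string[] columns, string[] values)
{
    var unknown = new List<string>();
    for (int i = 0; i < columns.Length; i++)
    {
        string column = columns[i];
        // Distinct values seen in this column; the comparison is case-sensitive,
        // so "Yes" would not match a stored "yes".
        var seen = new HashSet<string>(
            data.Rows.Cast<DataRow>().Select(row => row[column].ToString()));
        if (!seen.Contains(values[i]))
            unknown.Add(column);
    }
    return unknown;
}

// Cut-down stand-in for the training table in the question.
var data = new DataTable("Sales prediction Example");
data.Columns.Add("Brand");
data.Columns.Add("Discount");
data.Rows.Add("Highland", "yes");
data.Rows.Add("Flora", "No");

string[] columns = { "Brand", "Discount" };
var bad = FindUnknownValues(data, columns, new[] { "fff", "yes" });
Console.WriteLine(string.Join(", ", bad)); // prints "Brand"
```

If the returned list is not empty, you can show a message to the user and skip the call to Translate altogether.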

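As for the minimum amount of data needed to get accurate results: it really depends on how complex the data is and how much noise it contains. You need to train the model on one set of data and then test its accuracy against a completely separate set of data in order to measure whether it is working properly.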
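To make the train/test idea concrete, here is a rough sketch of a hold-out split and an accuracy measure in plain C#. The names SplitIndices and Accuracy, the 30% test fraction, and the seed are all illustrative assumptions, and the predictor is passed in as a delegate so the sketch does not depend on any particular Accord.Net call.

```csharp
using System;
using System.Linq;

// Shuffle the row indices and split them into a training part and a test part.
// testFraction and seed are illustrative parameters, not Accord.Net settings.
int[][] SplitIndices(int rowCount, double testFraction, int seed)
{
    var rng = new Random(seed);
    int[] shuffled = Enumerable.Range(0, rowCount).OrderBy(_ => rng.Next()).ToArray();
    int testCount = (int)Math.Round(rowCount * testFraction);
    return new[]
    {
        shuffled.Skip(testCount).ToArray(), // training indices
        shuffled.Take(testCount).ToArray()  // test indices
    };
}

// Fraction of rows whose predicted class matches the expected class.
double Accuracy(int[][] inputs, int[] expected, Func<int[], int> predict)
{
    int correct = 0;
    for (int i = 0; i < inputs.Length; i++)
        if (predict(inputs[i]) == expected[i]) correct++;
    return inputs.Length == 0 ? 0.0 : (double)correct / inputs.Length;
}

// 14 rows as in the question: 10 indices for training, 4 held out for testing.
int[][] parts = SplitIndices(14, 0.3, 42);

// Stand-in predictor and toy data, used only to show how Accuracy is called.
Func<int[], int> predict = x => x[0];
int[][] testInputs = { new[] { 0 }, new[] { 1 }, new[] { 1 }, new[] { 0 } };
int[] testOutputs = { 0, 1, 0, 0 };
double accuracy = Accuracy(testInputs, testOutputs, predict);
Console.WriteLine(accuracy); // 3 of 4 predictions match
```

With the code from the question, you would train the tree only on the rows in parts[0] and pass something like x => tree.Compute(x.ToDouble()) as the predictor for the rows in parts[1]. With only 14 rows the estimate will be very noisy, so treat it as a sanity check rather than a reliable figure.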
