caret package classifier does not respond

Time: 2016-04-15 15:28:01

Tags: r classification r-caret naivebayes

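I am trying to model my training data with a classifier from the caret package, but it does not respond for a very long time (I waited 2 hours). The same call works on other datasets.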

Here is the link to my training data: http://www.htmldersleri.org/train.csv (the well-known Reuters-21578 dataset)

The command I used is:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace AccessToSQL
{
public partial class Form1 : Form
{
    const string databaselocation = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\\Database1.accdb;Persist Security Info = False;";
    List<string> tables = new List<string>();
    public Form1()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        tables.Clear(); // reset so repeated clicks do not list (and copy) each table twice
        GetTableNames();
        const string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\\Database1.accdb;Persist Security Info = False;";
        const string connectionStringDest = "Data Source = TO\\SQLEXPRESS;Initial Catalog=Testing;Integrated Security=SSPI;";
        using (var sourceConnection = new OleDbConnection(connectionString))
        {
            sourceConnection.Open();
            using (var destinationConnection = new SqlConnection(connectionStringDest))
            {
                destinationConnection.Open();
                foreach (string tbl in tables)
                {
                    var commandSourceData = new OleDbCommand("Select * from "+tbl, sourceConnection);
                    var reader = commandSourceData.ExecuteReader();
                    using (var bulkCopy = new SqlBulkCopy(destinationConnection))
                    {
                        bulkCopy.DestinationTableName = "dbo."+tbl;
                        try { bulkCopy.WriteToServer(reader); }
                        catch (Exception ex) { MessageBox.Show(ex.Message); }
                        finally { reader.Close(); }
                    }
                }
            }
        }
    }
    public List<string> GetTableNames()
    {
        try {
            using (OleDbConnection con = new OleDbConnection(databaselocation))
            {
                con.Open();
                //DataTable schema = con.GetSchema("Columns");
                //foreach (DataRow row in schema.Rows)
                //{
                //    tables.Add(row.Field<string>("TABLE_NAME"));
                //}
                foreach (DataRow r in con.GetSchema("Tables").Select("TABLE_TYPE = 'TABLE'"))
                {
                    tables.Add(r["TABLE_NAME"].ToString());
                }
                return tables;
            }
        }
        catch (Exception ex) { MessageBox.Show(ex.Message); }
        return tables;
    }
}
}

Note: with any other method (e.g. svm, naive Bayes, etc.) it gets stuck all the same.

Note 2: with the e1071 package, the naiveBayes classifier does run, but the accuracy is 0.08%!

Can anyone tell me what the problem might be? Thanks in advance.

1 answer:

Answer 0: (score: 0)

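This looks like a multi-class classification problem. I am not sure whether caret supports this. However, I can show you how to do the same thing with mlr: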
library(mlr)
x <- read.csv("http://www.htmldersleri.org/train.csv")
tsk <- makeClassifTask(data = x, target = 'class')
# Assess the performance with 10-fold cross-validation
crossval('classif.knn', tsk)

If you want to know which of the learners integrated in mlr support this kind of task, type listLearners(tsk).
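
Before cross-validating, it may also help to confirm that the target really is multi-class and to see how large the table is. The following is only a minimal base-R sketch on a toy data frame: the column name `class` is taken from the mlr call above, while the feature columns `f1`, `f2` and all values are made up for illustration and are not the real data.

```r
# Toy stand-in for the training data (hypothetical values).
# The real file is assumed to have a target column named `class`.
x <- data.frame(
  f1    = c(0, 1, 0, 2, 1, 0),
  f2    = c(3, 0, 1, 0, 0, 2),
  class = c("earn", "acq", "earn", "crude", "acq", "earn"),
  stringsAsFactors = TRUE
)

# Rows and columns: a very wide table can make many learners slow.
dims <- dim(x)

# Number of distinct target values: more than 2 means multi-class.
n_classes <- nlevels(x$class)

# Class frequencies, most common first, to spot heavy imbalance.
class_counts <- sort(table(x$class), decreasing = TRUE)

print(dims)
print(n_classes)
print(class_counts)
```

To apply it to the real data, replace the toy data frame with the one loaded by read.csv above.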