(I am using Python with scikit-learn.) I have a dataset containing (a lot of) objects in this format:
{"word":"something", "data":[12, 24, 54, 65, 76, 87, 45, 65, 32, 12, 65, 13, 54, 76, 45, 72, 12, 11, 54, 23, 65]}
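I have several of these for each word. I built a sample dataset of 100 words with 3000 inputs per word.
I made it with a script that generates one hundred "seeds" and derives 3000 inputs from each one by randomly varying every number in the "data"
array by at most ±15 (to simulate the random variation of real-life sensors).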
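
A minimal sketch of the generation step described above, for illustration only. It assumes NumPy and uniform integer noise; the function name `make_dataset`, its parameters, and the 0..99 range for seed values are hypothetical, and the actual script may differ.

```python
import numpy as np

def make_dataset(n_seeds=100, n_per_seed=3000, n_features=21, max_noise=15, rng=None):
    """Build noisy copies of random seed vectors (hypothetical helper)."""
    rng = np.random.default_rng(rng)
    # One random "seed" vector per word; the 0..99 value range is an assumption.
    seeds = rng.integers(0, 100, size=(n_seeds, n_features))
    # Uniform integer noise in [-max_noise, +max_noise] for every value of every sample.
    noise = rng.integers(-max_noise, max_noise + 1,
                         size=(n_seeds, n_per_seed, n_features))
    # Broadcast each seed over its samples, then flatten to (n_samples, n_features).
    X = (seeds[:, None, :] + noise).reshape(-1, n_features)
    # Integer class id for each row: 0,0,...,1,1,...
    y = np.repeat(np.arange(n_seeds), n_per_seed)
    return seeds, X, y

# Small example: 5 words, 10 noisy inputs each.
seeds, X, y = make_dataset(n_seeds=5, n_per_seed=10, rng=0)
```

Every row of `X` stays within ±15 of the seed of its class, which is the property the simulated sensor noise is meant to have.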
From this dataset I saved ~297000 records to a DB (MongoDB) called "Words" as the training set, and another ~3000 to a different DB (called "Test") for testing.
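Now, the problem I have is that out of the 3000 tests I run, only 20 produce a prediction with an accuracy score of 1.0. These results do not sound right to me, so I think I am not setting up the classifier correctly.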
I have tried DecisionTree and KNeighborsClassifier. I assume these two are not the right classifiers for the kind of data I want to use. Which classifier should I use?
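
For reference, a minimal sketch of how a KNeighborsClassifier is typically fit and scored on data of this shape. It is not the code used in this question: the synthetic data, the sizes, the 10% test split, and `n_neighbors=5` are all assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data (assumed sizes and value range).
rng = np.random.default_rng(0)
n_words, n_per_word, n_features = 20, 200, 10
seeds = rng.integers(0, 1000, size=(n_words, n_features))
noise = rng.integers(-15, 16, size=(n_words, n_per_word, n_features))
X = (seeds[:, None, :] + noise).reshape(-1, n_features)
y = np.repeat(np.arange(n_words), n_per_word)

# Hold out part of the data, keeping the class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=0
)

clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

# Predict all held-out samples at once and compute one score over the whole set.
y_pred = clf.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"accuracy over {len(y_test)} test samples: {acc:.3f}")
```

Note that `accuracy_score` returns a single fraction for the whole test set (correct predictions divided by total predictions) rather than one score per test sample.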
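Edit
I am pasting part of the database (I have 300000 records, in which each word is repeated 1000 times). The fields are named "label" and "features" because some video on YouTube told me that is what they are called, haha: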
{"label":"XpTrKrqjOC","features":[152,179,848,12,499,408,405,377,228,222]}
{"label":"XpTrKrqjOC","features":[157,170,843,17,502,411,402,373,236,219]}
{"label":"XpTrKrqjOC","features":[156,177,844,22,503,413,398,380,236,227]}
{"label":"XpTrKrqjOC","features":[157,172,847,22,504,416,401,379,238,222]}
{"label":"XpTrKrqjOC","features":[157,177,846,15,499,417,397,376,238,221]}
{"label":"XpTrKrqjOC","features":[155,176,846,14,508,410,400,370,229,225]}
{"label":"cOYHgaxByT","features":[230,1,190,985,173,483,178,216,601,309]}
{"label":"cOYHgaxByT","features":[235,6,188,985,170,486,183,216,605,312]}
{"label":"cOYHgaxByT","features":[235,2,188,985,171,478,175,216,600,314]}
{"label":"cOYHgaxByT","features":[234,-4,190,987,177,478,177,220,600,309]}
{"label":"cOYHgaxByT","features":[235,-1,191,983,172,478,180,219,598,306]}
{"label":"cOYHgaxByT","features":[234,-1,190,983,178,480,174,221,597,313]}
{"label":"cOYHgaxByT","features":[225,-4,195,990,170,479,181,221,602,307]}
{"label":"ZWmNqLVaIZ","features":[546,73,52,445,193,175,158,561,317,503]}
{"label":"ZWmNqLVaIZ","features":[551,69,52,440,198,172,154,566,312,504]}
{"label":"ZWmNqLVaIZ","features":[543,77,55,445,193,179,163,565,313,508]}
{"label":"ZWmNqLVaIZ","features":[550,72,56,443,193,180,161,563,319,502]}
{"label":"ZWmNqLVaIZ","features":[542,77,55,450,194,173,155,558,315,501]}
{"label":"ZWmNqLVaIZ","features":[543,72,57,450,191,176,156,560,318,508]}
{"label":"ZWmNqLVaIZ","features":[550,68,49,443,194,180,154,563,312,500]}
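
A minimal sketch of how records in this format could be turned into the feature matrix and label vector that scikit-learn estimators expect. The in-memory `records` list reuses three rows from the sample above and stands in for whatever the MongoDB query returns; the variable names are hypothetical.

```python
import numpy as np

# Stand-in for documents read from the database (three rows copied from the sample).
records = [
    {"label": "XpTrKrqjOC", "features": [152, 179, 848, 12, 499, 408, 405, 377, 228, 222]},
    {"label": "cOYHgaxByT", "features": [230, 1, 190, 985, 173, 483, 178, 216, 601, 309]},
    {"label": "ZWmNqLVaIZ", "features": [546, 73, 52, 445, 193, 175, 158, 561, 317, 503]},
]

# X: one row per record, one column per feature. y: one label per record.
X = np.array([r["features"] for r in records])
y = np.array([r["label"] for r in records])

print(X.shape, y.shape)
```

scikit-learn classifiers accept string labels directly, so `y` does not need to be converted to integers before calling `fit(X, y)`.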