Test instances are always classified into the same class

Date: 2017-07-10 15:38:53

Tags: java machine-learning weka text-classification j48

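I wrote a Java program that builds a J48 classifier from a training set. I then pass test instances to the classifier to predict their unknown class. However, every test instance is classified as the first value of the class attribute.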

Example: if I declare the values of the class attribute as

  cls.add("positive");
  cls.add("negative"); 

in this order, every instance is classified as the first class value, i.e. positive.

If I declare the class values in this order,

cls.add("negative");
cls.add("positive");

every instance is classified as the first class value, i.e. negative.
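For context on why the printed label follows the declaration order: Weka's `classifyInstance` returns the predicted class as a `double` index, and `classAttribute().value((int) pred)` looks that index up in the list of declared values. The plain-Java sketch below (no Weka dependency; the names `LabelLookup` and `labelFor` are hypothetical) assumes the classifier returns index 0 for every instance and shows that the printed label is then simply whichever value was declared first.

```java
import java.util.ArrayList;
import java.util.List;

public class LabelLookup {
    // Mimics testSet.classAttribute().value((int) pred):
    // the prediction is only an index into the declared values.
    public static String labelFor(List<String> declaredValues, double pred) {
        return declaredValues.get((int) pred);
    }

    public static void main(String[] args) {
        List<String> positiveFirst = new ArrayList<>();
        positiveFirst.add("positive");
        positiveFirst.add("negative");

        List<String> negativeFirst = new ArrayList<>();
        negativeFirst.add("negative");
        negativeFirst.add("positive");

        // Assumption: the classifier returns index 0 for every instance.
        double pred = 0.0;

        System.out.println(labelFor(positiveFirst, pred)); // prints "positive"
        System.out.println(labelFor(negativeFirst, pred)); // prints "negative"
    }
}
```

So the symptom is consistent with a prediction that is constant at index 0, regardless of the tweet text.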

In the Weka Explorer, my training set shows an accuracy of over 90%.

public static void main(String[] args) throws Exception {
    StringToWordVector filter = new StringToWordVector();

    //training set
    BufferedReader reader;
    reader = new BufferedReader(new FileReader("D:/suicideTest.arff"));

    Instances train = new Instances(reader);
    train.setClassIndex(train.numAttributes() -1);
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);

    reader.close();

    J48 nb = new J48();
    nb.buildClassifier(train);

    ArrayList cls = new ArrayList(2);

    cls.add("positive");
    cls.add("negative"); 

    Attribute clsAtt = new Attribute("class", cls);

    //ArrayList<String> tweet = new ArrayList(1);
    //String tweet = "";
    //Attribute tweetAtt = new Attribute("tweet", tweet);

    ArrayList allAtt = new ArrayList(2); 
    //allAtt.add(tweetAtt);
    allAtt.add(new Attribute("tweet", (FastVector) null));
    allAtt.add(clsAtt);


    // Create an empty test set
    Instances testSet = new Instances("", allAtt, 1);
    // Set class index
    testSet.setClassIndex(testSet.numAttributes() - 1);

    String names=  "Fox News Devotes a Whole Evening[']s Worth of Programming to Whining About CNNAgain (column by @justinbaragona)";
    Instance inst = new DenseInstance(2); 
    inst.setValue((Attribute)allAtt.get(0), names.toString());

    testSet.add(inst);
    System.out.println(testSet.instance(0).toString());
    double pred = nb.classifyInstance(testSet.instance(0));

    filter.setInputFormat(testSet);
    testSet = Filter.useFilter(testSet, filter);

    String predictString = testSet.classAttribute().value((int) pred);

    System.out.println(predictString);
}
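One thing that may be worth checking in the code above: `classifyInstance` is called on the raw test instance before it has passed through the `StringToWordVector` filter, and `setInputFormat` is then called a second time on the test set, which as far as I understand re-initializes the filter. In general, a bag-of-words model expects test text to be encoded with the vocabulary learned from the training data. The sketch below illustrates that idea in plain Java without Weka; `BagOfWords`, `fit`, and `transform` are hypothetical names, not Weka API, and the tokenizer is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BagOfWords {
    // word -> column index, learned from the training texts only
    private final Map<String, Integer> vocabulary = new LinkedHashMap<>();

    // Simplified tokenizer: lower-case, split on anything that is not a letter, digit, or apostrophe.
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9']+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Learn the vocabulary; call this once, on the training data.
    public void fit(List<String> trainingTexts) {
        vocabulary.clear();
        for (String text : trainingTexts) {
            for (String tok : tokenize(text)) {
                vocabulary.putIfAbsent(tok, vocabulary.size());
            }
        }
    }

    // Encode any text with the already-learned vocabulary; unknown words are ignored.
    public int[] transform(String text) {
        int[] vec = new int[vocabulary.size()];
        for (String tok : tokenize(text)) {
            Integer col = vocabulary.get(tok);
            if (col != null) {
                vec[col] = 1;
            }
        }
        return vec;
    }

    public static void main(String[] args) {
        BagOfWords bow = new BagOfWords();
        bow.fit(Arrays.asList("I don't want to exist", "Happy Birthday America"));

        String test = "I don't want to go on";
        // Same columns as the training data, so a model trained on them can read this vector.
        System.out.println(Arrays.toString(bow.transform(test)));

        // Re-fitting on the test text yields a different column layout
        // that no longer lines up with what the model was trained on.
        BagOfWords refit = new BagOfWords();
        refit.fit(Arrays.asList(test));
        System.out.println(Arrays.toString(refit.transform(test)));
    }
}
```

In Weka itself, wrapping the filter and J48 in `weka.classifiers.meta.FilteredClassifier` is a common way to have the training-time filter applied to each test instance automatically; whether that resolves the behaviour described here is an assumption, not something confirmed in this thread.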

My training set looks like this:

@RELATION suicidalideation

@ATTRIBUTE tweet string
@ATTRIBUTE class {positive,negative}

@DATA

"is it weird to be tired of living. i just. don't want to exist for a while", positive
"I don't want to exist", positive
"i don't want to exist in this world without you", positive
"I simply just don't want to exist anymore.", positive
"I don't want to exist im fed up of all this shit", positive
"I am deeply sad that i don't want to exist now", positive
"It's feel like I don't want to exist to anybody life", positive
"fuck me up i don't want to exist for myself", positive
"I don't want to be alone, I don't want to be in my room, I don't want to exist", positive
"I just kind of don't want to exist right now cause stress is just too much", positive
"if i'm going to be a problem i don't want to exist", positive
"If he leaves me I don't want to go on anymore..will jump to running train ..", positive
"Im dead inside and I don't want to go on.", positive
"I don't want to go on living in this world without you", positive
"I hate my life and Don't want to go on further", positive
"I dont want to go on living in this world without you", positive
"Wance&Forall | Coming SoonnHow soon depends on you!nLIKE/FOLLOW Wance & Forall on "    , negative
"RT @ChampionsLeague: Zidane: Toughest opponent? Scholes. The complete midfielder; undoubtedly the greatest midfielder of his generation.[...]" , negative
"You are excited to participate in activities with people who m... "    , negative
"RT @pussypun: rub ur ass against his dick by accident on purpose"  , negative
"RT @NiallOfficial: Have a great 4th of July , America" , negative
"RT @thedrewpowell: Happy Birthday America! #FourthOfJuly"  , negative
"RT @virtuaIg: she a child sir "    , negative
"Happy 4th of July Fillies!  Good day to play some golf. Coach played 9. How bout you? "    , negative
"@TheEFCForum Fair enough. We will see. But I'm doubting I'm going to be wrong on this to be honest, if I am fair enough."  , negative
"RT @_EVANGELO: We are aware we weren't free on this date. Please shut the entire fuck up and EAT. Gaddamn! Every damn holiday y'all got som[...]"  , negative
"Langley's Abbey Fortin going to bat for Canada #Langley #bhivecan" , negative
"Daily Organic Multivitamins for Kids GLUTEN FREE - SUGAR FREE - VEGAN - KOSHER - HALAL VITAMIN SUPPLEMENTS  Nutra Pharmnmultivitamin has 45"   , negative
"RT @skinupgg:  ST SSG 08 DRAGONFIRE GIVEAWAY OH MY GOD nnIT'S SO LIT nTO ENTERnn[?] RETWEET & FOLLOWn[?] " , negative
"RT @DerekCressman: Happy July 4th. Let's Declare American Independence from Russian interference in our sovereignty"   , negative
"RT @wordstionary: Never let your past dictate your future. It's never too late to become better."  , negative
"RT @Jon_Wienke: Stopped to clean his sign. Nothing but respect for my president. " , negative
"@Ayindeemu Please help my friend win the @isokenmovie #IsokenMegaGiveaway contest by liking the tweet above    Please!n @_idarahh" , negative
"RT @politicques: What to The Slave is 4th of July? -- 1841 Speech by Frederick Douglass -- Courtesy of The Freeman Institute... "  , negative
"idk why that bothers me"   , negative
"RT @JPY_Kurdish: Stoke-on-Trent, England: [?] pedophile hunter group catches a Muslim man who wants to meet a 13-year-old girl for sex at Cen[...]"    , negative
"@Danny_Draws Hah, that's very Portland. I wonder if it affects water pressure. :)" , negative
"RT @CSGORoll: AK-47 | Case Hardened FTnn- RTn- Follown- Enable notifications on CSGORolln- Play "  , negative
"Nobody can go back... Maria Robinson #quote #quotes #inspiration #findyourhappily "    , negative
"RT @ManuBlinkVIP: Rosé talking about IU again! You're doing great sweetie, keep mentioning that, this collab must happen!" , negative
"need to be as lit as the fireworks tonight"    , negative
"@RyderDanielz @davey_sap ...I think he would be a great opponent for me at 'Chantz Of A Lifetime' for my titles."  , negative
"RT @RealNiggy_: well if you move i could see all my options "  , negative
"@Komangsrie21 polback donv"    , negative
"RT @PeadaPipper: Easy Breezy Beautiful.  " , negative

0 answers:

No answers.