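Is it possible to store tweets in a CSV file?

Time: 2017-03-03 08:10:26

Tags: java twitter4j

I have successfully fetched more than 100 tweets, but I cannot work out how to store them in a .csv file. I have tried the file-handling classes; how should I store the tweets?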

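import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.doccat.DoccatModel;
import opennlp.tools.doccat.DocumentCategorizerME;
import opennlp.tools.doccat.DocumentSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

import twitter4j.Query;
import twitter4j.QueryResult;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;

public class SentimentAnalysisWithCount {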

DoccatModel model;
static int positive = 0;
static int negative = 0;

public static void main(String[] args) throws IOException, TwitterException {
    String line = "";
    SentimentAnalysisWithCount twitterCategorizer = new SentimentAnalysisWithCount();
    twitterCategorizer.trainModel();

    ConfigurationBuilder cb = new ConfigurationBuilder();
    cb.setDebugEnabled(true)
        .setOAuthConsumerKey("--------------------------------------------------")
        .setOAuthConsumerSecret("--------------------------------------------------")
        .setOAuthAccessToken("--------------------------------------------------")
        .setOAuthAccessTokenSecret("--------------------------------------------------");
    TwitterFactory tf = new TwitterFactory(cb.build());
    Twitter twitter = tf.getInstance();
    Query query = new Query("udta punjab");
    QueryResult result = twitter.search(query);
    int result1 = 0;
    for (Status status : result.getTweets()) {
        result1 = twitterCategorizer.classifyNewTweet(status.getText());
        if (result1 == 1) {
            positive++;
        } else {
            negative++;
        }
    }

    BufferedWriter bw = new BufferedWriter(new FileWriter("C:\\Users\\User\\Desktop\\results.csv"));
    bw.write("Positive Tweets," + positive);
    bw.newLine();
    bw.write("Negative Tweets," + negative);
    bw.close();
}

public void trainModel() {
    InputStream dataIn = null;
    try {
        dataIn = new FileInputStream("C:\\Users\\User\\Downloads\\tweets.txt");
        ObjectStream lineStream = new PlainTextByLineStream(dataIn, "UTF-8");
        ObjectStream sampleStream = new DocumentSampleStream(lineStream);
        // Specifies the minimum number of times a feature must be seen
        int cutoff = 2;
        int trainingIterations = 30;
        model = DocumentCategorizerME.train("en", sampleStream, cutoff,
                trainingIterations);
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (dataIn != null) {
            try {
                dataIn.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

public int classifyNewTweet(String tweet) throws IOException {
    DocumentCategorizerME myCategorizer = new DocumentCategorizerME(model);
    double[] outcomes = myCategorizer.categorize(tweet);
    String category = myCategorizer.getBestCategory(outcomes);

    System.out.print("-----------------------------------------------------\nTWEET :" + tweet + " ===> ");
    if (category.equalsIgnoreCase("1")) {
        System.out.println(" POSITIVE ");
        return 1;
    } else {
        System.out.println(" NEGATIVE ");
        return 0;
    }

}
}

In this code, the tweets that are printed to the console should be stored in the .csv file.

1 answer:

Answer 0 (score: 1)

Please remove your API keys from Stack Overflow. You should not post them publicly.
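
One way to keep the keys out of posted code is to read them from environment variables at startup. The following is only a sketch: the helper class `EnvCredentials` and the variable name `TWITTER_CONSUMER_KEY` are hypothetical and are not part of twitter4j.

```java
import java.util.Map;

/** Hypothetical helper: looks up secrets in an environment map instead of hard-coding them. */
public class EnvCredentials {

    /** Returns the value stored under name, or throws if it is missing or blank. */
    public static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        try {
            // The variable name is an assumption; pick whatever fits your setup.
            String consumerKey = require(env, "TWITTER_CONSUMER_KEY");
            System.out.println("Consumer key loaded (" + consumerKey.length() + " characters)");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The returned strings could then be passed to `setOAuthConsumerKey(...)` and the other `ConfigurationBuilder` setters in place of the string literals.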

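Yes, you can store tweets in a CSV file; you only need to extend the snippet you posted so that it writes each tweet as well as the totals. The following snippet should give you an idea of how to do this in Java 8: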

    try (BufferedWriter bw = new BufferedWriter(new FileWriter("C:\\Users\\User\\Desktop\\results.csv"))) {

        int positive = 0;
        int negative = 0;

        // One builder is reused for every row and reset at the end of each iteration
        StringBuilder sb = new StringBuilder();
        for (Status status : result.getTweets()) {
            String tweetText = status.getText();
            long tweetId = status.getId();
            int classificationResult = twitterCategorizer.classifyNewTweet(tweetText);

            if (classificationResult == 1) {
                positive++;
            } else {
                negative++;
            }

            sb.append("ID=").append(tweetId).append(",TEXT=").append(tweetText).append(",classificationResult=").append(classificationResult);

            // Write one line per tweet
            bw.write(sb.toString());
            bw.newLine();

            // Clear the builder for the next tweet
            sb.setLength(0);
        }

        bw.write("##### SUMMARY #####");
        bw.newLine();
        bw.write("Positive Tweets," + positive);
        bw.newLine();
        bw.write("Negative Tweets," + negative);
        // No explicit close() needed: try-with-resources closes the writer

    } catch (IOException e) {
        // TODO: exception handling
    }

results.csv would then look like this:

ID=25125125,TEXT=some fancy text here,classificationResult=1
ID=25146734725,TEXT=some fancy text1 here,classificationResult=0
ID=25127575125,TEXT=some fancy text2 here,classificationResult=1
ID=251258979125,TEXT=some fancy text3 here,classificationResult=0
ID=25125867125,TEXT=some fancy text4 here,classificationResult=1
##### SUMMARY #####
Positive Tweets,3
Negative Tweets,2
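
Tweet text often contains commas, double quotes, or line breaks, and any of these would break rows like the ones above when the file is read as CSV. A minimal sketch of one common remedy follows, assuming RFC 4180-style quoting (wrap the field in double quotes and double any embedded quote). The class `CsvRow` and its methods are hypothetical names, not part of twitter4j or of the snippet above.

```java
/** Hypothetical helper that formats one tweet as a CSV line. */
public class CsvRow {

    /** Quotes a field if it contains a comma, a double quote, or a line break. */
    public static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Double embedded quotes, then wrap the whole field in quotes
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Builds one CSV line with the columns: id, text, classification. */
    public static String toLine(long tweetId, String tweetText, int classificationResult) {
        return tweetId + "," + escape(tweetText) + "," + classificationResult;
    }

    public static void main(String[] args) {
        System.out.println(toLine(25125125L, "great movie, loved it", 1));
        System.out.println(toLine(25146734725L, "he said \"meh\"", 0));
    }
}
```

Inside the loop, `bw.write(CsvRow.toLine(tweetId, tweetText, classificationResult))` could take the place of the `StringBuilder` calls. This drops the `ID=` and `TEXT=` labels, so a header line such as `id,text,classificationResult` might be written once before the loop instead.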