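Total count of a substring across strings in Java

Time: 2016-04-12 16:42:07

Tags: java string twitter substring

I have a program that fetches tweets from Twitter containing a specific word, then searches each tweet to count occurrences of another word related to that topic (in this case the main word is "cameron" and it looks for "tax" and "panama"). It works for each individual tweet, but I can't figure out how to get a cumulative count of all occurrences. I have tried incrementing a variable whenever the word appears, but that doesn't seem to work. The code is below; I have removed my Twitter API keys for obvious reasons.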

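import java.util.Date;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import twitter4j.Query;
import twitter4j.QueryResult;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;

public class TwitterWordCount {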

    public static void main(String[] args) {
        ConfigurationBuilder configBuilder = new ConfigurationBuilder();
        configBuilder.setOAuthConsumerKey("XXXXXXXXXXXXXXXXXX");
        configBuilder.setOAuthConsumerSecret("XXXXXXXXXXXXXXXXXX");
        configBuilder.setOAuthAccessToken("XXXXXXXXXXXXXXXXXX");
        configBuilder.setOAuthAccessTokenSecret("XXXXXXXXXXXXXXXXXX");

        //create instance of twitter for searching etc.
        TwitterFactory tf = new TwitterFactory(configBuilder.build());
        Twitter twitter = tf.getInstance();

        //build query
        Query query = new Query("cameron");

        //number of results pulled each time
        query.setCount(100);

        //set the language of the tweets that we want
        query.setLang("en");

        //Execute the query
        QueryResult result;
        try {
            result = twitter.search(query);

            //Get the results
            List<Status> tweets = result.getTweets();

            //Print out the information
            for (Status tweet : tweets) {
                //get information about the tweet
                String userName = tweet.getUser().getName();
                long userId = tweet.getUser().getId();
                Date creationDate = tweet.getCreatedAt();
                String tweetText = tweet.getText();

                //print out the information
                System.out.println();
                System.out.println("Tweeted by " + userName + "(" + userId + ") on date " + creationDate);
                System.out.println("Tweet: " + tweetText);
                // System.out.println();
                String s = tweetText;
                Pattern pattern = Pattern.compile("\\w+");
                Matcher matcher = pattern.matcher(s);
                while (matcher.find()) {
                    System.out.print(matcher.group() + " ");

                }

                String str = s;
                String findStr = "tax";
                int lastIndex = 0;
                int count = 0;
                //int countall = 0;

                while (lastIndex != -1) {
                    lastIndex = str.indexOf(findStr, lastIndex);

                    if (lastIndex != -1) {
                        count++;
                        lastIndex += findStr.length();
                        //countall++;
                    }
                }

                System.out.println();
                System.out.println(findStr + " = " + count);

                String two = tweetText;

                String str2 = two;
                String findStr2 = "panama";
                int lastIndex2 = 0;
                int count2 = 0;

                while (lastIndex2 != -1) {
                    lastIndex2 = str2.indexOf(findStr2, lastIndex2);

                    if (lastIndex2 != -1) {
                        count2++;
                        lastIndex2 += findStr2.length();
                    }
                }

                System.out.println(findStr2 + " = " + count2);
            }
        }
        catch (TwitterException ex) {
            ex.printStackTrace();
        }
    }
}

I also know this is definitely not the cleanest program; it's a work in progress!

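1 Answer:

Answer 0: (score: 1)

You have to define the count variables outside the for loop. Declared inside the loop body, they are reset to 0 for every tweet, so they can only ever hold the count for the current tweet.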

int countKeyword1 = 0;
int countKeyword2 = 0;

for (Status tweet : tweets) {

    //increase the count variables in your while loops

}

System.out.println("Keyword1 occurrences : " + countKeyword1);
System.out.println("Keyword2 occurrences : " + countKeyword2);
System.out.println("All occurrences : " + (countKeyword1 + countKeyword2));
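
For a fuller picture, here is a minimal sketch of the same idea that runs without Twitter4J. The class name `CumulativeCount`, the helper `countOccurrences`, and the sample strings are all hypothetical, introduced only for illustration; the strings stand in for the values `tweet.getText()` would return. The helper uses the same `indexOf` loop as the question, and the totals are declared before the for loop so they accumulate across tweets.

```java
import java.util.Arrays;
import java.util.List;

public class CumulativeCount {

    // Counts non-overlapping, case-sensitive occurrences of findStr in text,
    // using the same indexOf approach as the question's code.
    public static int countOccurrences(String text, String findStr) {
        if (findStr.isEmpty()) {
            return 0; // an empty search string would otherwise loop forever
        }
        int count = 0;
        int lastIndex = text.indexOf(findStr);
        while (lastIndex != -1) {
            count++;
            lastIndex = text.indexOf(findStr, lastIndex + findStr.length());
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for tweet.getText() values.
        List<String> tweetTexts = Arrays.asList(
                "cameron and tax, more tax",
                "cameron panama papers",
                "nothing relevant here");

        // Running totals are declared outside the loop so they are not reset per tweet.
        int totalTax = 0;
        int totalPanama = 0;

        for (String text : tweetTexts) {
            totalTax += countOccurrences(text, "tax");
            totalPanama += countOccurrences(text, "panama");
        }

        System.out.println("tax occurrences : " + totalTax);
        System.out.println("panama occurrences : " + totalPanama);
        System.out.println("All occurrences : " + (totalTax + totalPanama));
    }
}
```

Pulling the search loop into one method also avoids copy-and-paste slips between the two keyword blocks. Note that matching here is case-sensitive, as in the question; lowercasing both the tweet text and the keyword before counting would be one way to also catch "Tax" or "Panama".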