Word count on an RSS feed

Time: 2016-02-15 06:40:09

Tags: java rss

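I wrote a program that fetches data from NPR's RSS feed, and now I'd like to know how to do a frequency count of the words in the feed's descriptions. Here are the two programs I'm trying to consolidate.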

package twp.brady.barry;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public class RSSReader {
    public static String readRSS(String urlAddress){
        try{
            URL rssUrl = new URL(urlAddress);
            BufferedReader in = new BufferedReader(new InputStreamReader(rssUrl.openStream()));
            String sourceCode = "";
            String line;
            while((line = in.readLine())!=null){
                // Extract the text of a <title> element that opens and closes on this line.
                if (line.contains("<title>")){
                    int firstPos = line.indexOf("<title>");
                    String temp = line.substring(firstPos);
                    temp = temp.replace("<title>","");
                    int lastPos = temp.indexOf("</title>");
                    // Skip the line if the closing tag is missing; substring(0, -1) would throw.
                    if (lastPos >= 0){
                        sourceCode += temp.substring(0,lastPos)+"\n";
                    }
                }
                // Same extraction for a <description> element.
                if (line.contains("<description>")){
                    int firstPos = line.indexOf("<description>");
                    String temp = line.substring(firstPos);
                    temp = temp.replace("<description>","");
                    int lastPos = temp.indexOf("</description>");
                    if (lastPos >= 0){
                        sourceCode += temp.substring(0,lastPos)+"\n";
                    }
                }
            }
            in.close();
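            return sourceCode;
        }
        catch(MalformedURLException ue){
            System.out.println("Malformed URL");
        }
        catch(IOException ioe){
            System.out.println("Something went wrong reading the comments");
        }
        // Reached only after an exception: return no text rather than the URL,
        // so callers do not mistake the address for feed content.
        return "";
    }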

    public static void main(String[] args){

        System.out.println(readRSS("http://www.npr.org/rss/rss.php?id=1001"));
    }
}

This is the program I found to help with the word count:

package twp.brady.barry;

public class FrequencyOfWords 
{

    public static void main(String[]args)
    {
        String text = "apples are apples and I love them";
        String[] keys = text.split(" ");
        String[] uniqueKeys;
        int count = 0;

        uniqueKeys = getUniqueKeys(keys);

        for(String key: uniqueKeys)
        {
            if(null == key)
            {
                break;
            }
            for(String s : keys)
            {
                if(key.equals(s))
                {
                    count++;
                }
            }
            System.out.println("Count of ["+key+"] is : "+count);
            count=0;
        }
    }
    private static String[] getUniqueKeys(String[] keys)
    {
        String[] uniqueKeys = new String[keys.length];

        uniqueKeys[0] = keys[0];
        int uniqueKeyIndex = 1;
        boolean keyAlreadyExists = false;

        for(int i=1; i<keys.length ; i++)
        {
            for(int j=0; j<uniqueKeyIndex; j++)
            {
                if(keys[i].equals(uniqueKeys[j]))
                {
                    keyAlreadyExists = true;
                }
            }

            if(!keyAlreadyExists)
            {
                uniqueKeys[uniqueKeyIndex] = keys[i];
                uniqueKeyIndex++;
            }
            keyAlreadyExists = false;
        }
        return uniqueKeys;
    }
}

I can't find a way to combine the two so that I can get word counts from the RSS descriptions.

1 answer:

Answer 0: (score: 0)

Do you mean just counting the words in the whole RSS feed? Then you only need to replace your line

String text = "apples are apples and I love them";

with

String text = RSSReader.readRSS("http://www.npr.org/rss/rss.php?id=1001");
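Once the feed text is in a single string, a map can replace the nested loops of FrequencyOfWords. The following is only a minimal sketch: the class name `WordFrequency` and the method `countWords` are hypothetical and not part of the question. It also assumes that words should be lower-cased and split on punctuation as well as spaces, which differs from the original `split(" ")`.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the original programs.
final class WordFrequency {

    // Counts how often each word occurs in the text. Words are lower-cased
    // (an assumption) and split on any run of characters that is not a
    // letter, a digit, or an apostrophe.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps first-seen order
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+")) {
            if (word.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in text; in the combined program this would be the string
        // returned by readRSS.
        String text = "apples are apples and I love them";
        for (Map.Entry<String, Integer> e : countWords(text).entrySet()) {
            System.out.println("Count of [" + e.getKey() + "] is : " + e.getValue());
        }
    }
}
```

The map makes one pass over the words, so there is no need for a separate array of unique keys.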

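If you want to count the occurrences of words in each description/title, you have to adapt your readRSS method so that it does not iterate over the whole RSS feed, but instead stops after each title/description combination and counts its words.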
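One possible way to get per-item counts is sketched below. Note that it swaps the line-by-line scanning of readRSS for the JDK's DOM parser, which is a different technique from the one in the question. The names `PerItemWordCount`, `countPerItem`, and `textOf` are hypothetical, and the sketch assumes the feed XML is already available as a string and that each entry is an `<item>` element with `<title>` and `<description>` children.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original programs.
final class PerItemWordCount {

    // Word counts for one piece of text; lower-casing is an assumption.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns one word-count map per <item>, in feed order. Each map covers
    // the item's title and description together.
    static List<Map<String, Integer>> countPerItem(String rssXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse feed XML", e);
        }
        NodeList items = doc.getElementsByTagName("item");
        List<Map<String, Integer>> result = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            String combined = textOf(item, "title") + " " + textOf(item, "description");
            result.add(countWords(combined));
        }
        return result;
    }

    // Text of the first child element with the given tag, or "" if absent.
    private static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        // Made-up sample feed for illustration only.
        String xml = "<rss><channel><title>Feed</title>"
                + "<item><title>Apples</title><description>apples are apples</description></item>"
                + "<item><title>Pears</title><description>I love pears</description></item>"
                + "</channel></rss>";
        List<Map<String, Integer>> perItem = countPerItem(xml);
        for (int i = 0; i < perItem.size(); i++) {
            System.out.println("Item " + i + ": " + perItem.get(i));
        }
    }
}
```

Because only `<item>` elements are visited, the channel's own title and description are left out of the counts.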