Counting the total number of tweets that appear more than once

Time: 2016-03-25 19:57:18

Tags: java

This file is the output I get when I run a project that processes tweets. Each tweet has its own ID, and an ID may or may not appear more than once.

Start: 1458763849881
712044789663924224  RT @WHUFC_News: West Ham are the only team in the top half of the Premier League without a player in the England squad. 
712044789663928320  RT @BlackPplVines: Don't say anything just RT 
712044789659787266  the 100 is wild []
712044789630275584  RT @SincerelyTumblr: me trying to concentrate []
712044789630283776  RT @Marie_aguilar98: Bless the man that ends up with me because I have a big attitude along with a big temper []
712044789647147009  If you could visit any place in the World, where would it be? — Hawaii and Japan []
712044789659787268  ほんとやめて、視聴者を殺しに来るのは  []
712044789647052800  Who wanna be nice enough to bring me some     [[712044789663928320,"RT  Don't say anything just RT ,0.0]]
712044789634637825  Y6 drama #for #against #debate @MissT_02 @Michelle_Hill[[712044789663928320,"RT  Don't say anything just RT ,0.2316541497936268]]

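I need to count the total number of tweets that appear more than once, and also the total number of tweets that appear only once.

The tweet IDs here start with 7120447896, and when a tweet appears again it has the same ID.

For example, the main tweet ID 712044789663928320 appears again in the last line I included.

How should I start? I began by thinking about looping over the IDs, but I didn't write it well: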

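    // Needs: import java.io.FileReader; import java.util.HashMap; import java.util.Scanner;
    FileReader reader = new FileReader("/home/user/results.txt");
    Scanner scnr = new Scanner(reader);
    HashMap<String, Integer> h = new HashMap<String, Integer>();
    int lineNumber = 1;
    while (scnr.hasNextLine()) {
        String line = scnr.nextLine();
        // Stores the whole line keyed to its line number; this does not count IDs yet.
        h.put(line, lineNumber);
        lineNumber++;
    }
    scnr.close();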

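2 answers:

Answer 0 (score: 2)

Use a Scanner to parse the file line by line. Keep the IDs in a collection (a HashMap that maps each ID to its count) and check whether it already contains the ID. If it does not, add the ID with a value of 1; if it does, increment that ID's count by one.

Edit: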

HashMap<String, Integer> m = new HashMap<>();
//...
// first time this ID is seen
m.put("ID as string", 1);
//...
// ID seen again: increment its count
m.replace("ID as string", m.get("ID as string") + 1);
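To connect this to the file in the question, here is a minimal sketch of the counting step. It rests on an assumption that is not stated in the question: that a tweet ID is a standalone run of exactly 18 digits, as in the sample lines, so that IDs inside the trailing `[[...]]` part are counted as well. The class name `TweetIdCounter` and the method `countIds` are made up for illustration, and `Map.merge` (Java 8+) is used in place of the separate `put`/`replace` calls.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TweetIdCounter {
    // Assumption: a tweet ID is a run of exactly 18 digits not adjacent to other digits.
    private static final Pattern TWEET_ID = Pattern.compile("(?<!\\d)\\d{18}(?!\\d)");

    // Returns a map from each ID found in the lines to the number of times it occurs.
    static Map<String, Integer> countIds(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            Matcher m = TWEET_ID.matcher(line);
            while (m.find()) {
                // Inserts 1 for a new ID, otherwise adds 1 to the existing count.
                counts.merge(m.group(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Lines shortened from the sample in the question.
        List<String> sample = Arrays.asList(
            "Start: 1458763849881",
            "712044789663928320  RT @BlackPplVines: Don't say anything just RT ",
            "712044789659787266  the 100 is wild []",
            "712044789647052800  Who wanna be nice [[712044789663928320,\"RT  Don't say anything just RT ,0.0]]");
        Map<String, Integer> counts = countIds(sample);
        System.out.println(counts.get("712044789663928320")); // prints 2
        System.out.println(counts.size());                    // prints 3
    }
}
```

If the tweet text itself can contain an 18-digit number, this pattern would count it too, so a stricter rule (for example, only the leading token plus the IDs inside the brackets) may be needed for real data.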

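Answer 1 (score: 1)

You can use a hash map, with the tweet ID as the key and the number of times that tweet occurred as the value. After reading the file and storing it in the hash map, you can easily find out how many times a particular tweet ID was tweeted.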
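Once such a map exists, the two totals asked for in the question can be read off its values. The following is only a sketch: the class `TweetCountSummary` and its method names are hypothetical, and the map in `main` is filled by hand rather than from the file.

```java
import java.util.HashMap;
import java.util.Map;

class TweetCountSummary {
    // Number of distinct IDs whose count is exactly 1.
    static long idsSeenOnce(Map<String, Integer> counts) {
        return counts.values().stream().filter(c -> c == 1).count();
    }

    // Number of distinct IDs whose count is greater than 1.
    static long idsSeenMoreThanOnce(Map<String, Integer> counts) {
        return counts.values().stream().filter(c -> c > 1).count();
    }

    public static void main(String[] args) {
        // Hand-built counts for illustration; in practice this map comes from reading the file.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("712044789663928320", 3);
        counts.put("712044789659787266", 1);
        counts.put("712044789647052800", 1);
        System.out.println(idsSeenOnce(counts));         // prints 2
        System.out.println(idsSeenMoreThanOnce(counts)); // prints 1
    }
}
```

These totals count distinct IDs. If the total number of repeated occurrences is wanted instead, sum the values greater than 1 rather than counting them.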