This file is the output I get when I run a project that collects tweets. Each tweet has its own ID, and an ID may or may not appear more than once.
Start: 1458763849881
712044789663924224 RT @WHUFC_News: West Ham are the only team in the top half of the Premier League without a player in the England squad.
712044789663928320 RT @BlackPplVines: Don't say anything just RT
712044789659787266 the 100 is wild []
712044789630275584 RT @SincerelyTumblr: me trying to concentrate []
712044789630283776 RT @Marie_aguilar98: Bless the man that ends up with me because I have a big attitude along with a big temper []
712044789647147009 If you could visit any place in the World, where would it be? — Hawaii and Japan []
712044789659787268 ほんとやめて、視聴者を殺しに来るのは []
712044789647052800 Who wanna be nice enough to bring me some [[712044789663928320,"RT Don't say anything just RT ,0.0]]
712044789634637825 Y6 drama #for #against #debate @MissT_02 @Michelle_Hill[[712044789663928320,"RT Don't say anything just RT ,0.2316541497936268]]
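I need to count the total number of tweets that appear more than once, and also the total number of tweets that appear only once.
The tweet IDs here start with 7120447896. When a tweet appears again, it has the same ID, like the main tweet ID 712044789663928320, which appears again in the last line I posted.
How should I start? I began thinking about looping over the IDs, but what I wrote does not work well: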
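import java.io.FileReader;
import java.util.HashMap;
import java.util.Scanner;

FileReader reader = new FileReader("/home/user/results.txt");
Scanner scnr = new Scanner(reader);
HashMap<String, Integer> h = new HashMap<String, Integer>();
int lineNumber = 1;
while (scnr.hasNextLine()) {
    String line = scnr.nextLine();
    // The original used an undeclared variable `i` here; lineNumber is the only counter in scope.
    // Note that this maps each whole line to its line number, not an ID to its count.
    h.put(line, lineNumber);
    lineNumber++;
}
scnr.close();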
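Answer 0 (score: 2)
Use a Scanner to parse the file line by line. Store the IDs in a HashMap that maps each ID to its count. For each ID, check whether the map already contains it: if not, add it with a value of 1; if it is already in the map, increment its count by one.
Edit: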
HashMap<String, Integer> m = new HashMap<>();
//...
// First time an ID is seen: store it with a count of 1
m.put("ID as string", 1);
//...
// ID seen again: increment its stored count
m.replace("ID as string", m.get("ID as string") + 1);
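The put/replace pair above can be folded into a single call. What follows is only a rough sketch, not the asker's actual parser: it assumes that a tweet ID is a standalone run of exactly 18 digits (true for the sample lines, but a tweet's text could in principle contain such a number too), and the class name TweetIdCounter and method countIds are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TweetIdCounter {
    // Assumption: a tweet ID is a standalone run of exactly 18 digits.
    // The 13-digit timestamp on the "Start:" line therefore does not match.
    private static final Pattern ID_PATTERN = Pattern.compile("\\b\\d{18}\\b");

    // Hypothetical helper: counts every ID occurrence across all lines,
    // including IDs that show up inside the trailing [[...]] part of a line.
    static Map<String, Integer> countIds(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher matcher = ID_PATTERN.matcher(line);
            while (matcher.find()) {
                // merge() stores 1 for a new ID, otherwise adds 1 to the existing count.
                counts.merge(matcher.group(), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

The lines could come from Files.readAllLines(Paths.get("/home/user/results.txt")) or from the Scanner loop in the question. If only the leading ID of each line should count, the pattern would need to be anchored to the start of the line instead.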
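Answer 1 (score: 1)
You can use a hash map, with the tweet ID as the key and the number of times that tweet occurs as the value. After reading the file and storing it in the hash map, you can easily find out how many times a particular tweet ID has been tweeted.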
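Once such a map exists, the two totals from the question come from a single pass over its values. This is a minimal sketch under the assumption that "total" means the number of distinct IDs in each group; the class TweetTally, its methods, and the counts in main are illustrative and not taken from the real file.

```java
import java.util.HashMap;
import java.util.Map;

class TweetTally {
    // Number of distinct IDs that were seen more than once.
    static int repeated(Map<String, Integer> counts) {
        int total = 0;
        for (int count : counts.values()) {
            if (count > 1) {
                total++;
            }
        }
        return total;
    }

    // Number of distinct IDs that were seen exactly once.
    static int single(Map<String, Integer> counts) {
        int total = 0;
        for (int count : counts.values()) {
            if (count == 1) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Illustrative counts only; a real map would be filled while reading the file.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("712044789663928320", 3);
        counts.put("712044789659787266", 1);
        counts.put("712044789630275584", 1);
        System.out.println("IDs seen more than once: " + repeated(counts));
        System.out.println("IDs seen exactly once: " + single(counts));
    }
}
```

If the total should instead count every repeated occurrence rather than every distinct ID, summing the values greater than one would be the corresponding change.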