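I have this long string here, and my text file contains about 1000 lines like it.
I want to count how often each date appears in that text file. How should I go about it?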
{"interaction":{"author":{"id":"53914918","link":"http:\/\/twitter.com\/53914918","name":"ITTIA","username":"s8c"},"content":"RT @fubarista: After thousands of years of wars I am not an optimist about peace. The US economy is totally reliant on war. It is the on ...","created_at":"Sun, 10 Jul 2011 08:22:16 +0100","id":"1e0aac556a44a400e07497f48f024000","link":"http:\/\/twitter.com\/s8c\/statuses\/89957594197803008","schema":{"version":2},"source":"oauth:258901","type":"twitter","tags":["attretail"]},"language":{"confidence":100,"tag":"en"},"salience":{"content":{"sentiment":4}},"twitter":{"created_at":"Sun, 10 Jul 2011 08:22:16 +0100","id":"89957594197803008","mentions":["fubarista"],"source":"oauth:258901","text":"RT @fubarista: After thousands of years of wars I am not an optimist about peace. The US economy is totally reliant on war. It is the on ...","user":{"created_at":"Mon, 05 Jan 2009 14:01:11 +0000","geo_enabled":false,"id":53914918,"id_str":"53914918","lang":"en","location":"Mouth of the abyss","name":"ITTIA","screen_name":"s8c","time_zone":"London","url":"https:\/\/thepiratebay.se"}}}
Answer 0 (score: 1)
Use the RandomAccessFile and BufferedReader classes to read the data in parts; you can then use string parsing to count the frequency of each date...
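A minimal sketch of this idea, assuming each line of the file holds one JSON record and that the date of interest is the first "created_at" value on the line. The class name, the method names and the substring arithmetic are illustrative and not part of the answer; only BufferedReader is used, since a sequential read does not need RandomAccessFile.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

class LineDateCounter {
    // Hypothetical marker: the text that precedes a created_at value in the sample record.
    private static final String KEY = "\"created_at\":\"";

    // Returns e.g. "10 Jul 2011" from the first created_at value on the line, or null if none is found.
    static String extractDate(String line) {
        int start = line.indexOf(KEY);
        if (start < 0) return null;
        start += KEY.length();
        int end = line.indexOf('"', start);
        if (end < 0) return null;
        // The value looks like "Sun, 10 Jul 2011 08:22:16 +0100"; keep day, month and year.
        String[] parts = line.substring(start, end).split(" ");
        if (parts.length < 4) return null;
        return parts[1] + " " + parts[2] + " " + parts[3];
    }

    // Reads the source line by line and counts how many lines carry each date.
    static Map<String, Integer> countDates(Reader source) {
        Map<String, Integer> counts = new TreeMap<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String date = extractDate(line);
                if (date != null) counts.merge(date, 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample =
              "{\"created_at\":\"Sun, 10 Jul 2011 08:22:16 +0100\"}\n"
            + "{\"created_at\":\"Sun, 10 Jul 2011 09:00:00 +0100\"}\n"
            + "{\"created_at\":\"Mon, 11 Jul 2011 10:15:00 +0100\"}\n";
        System.out.println(countDates(new StringReader(sample)));
        // prints {10 Jul 2011=2, 11 Jul 2011=1}
    }
}
```

To run it against a real file, pass a FileReader for that file instead of the StringReader. Plain string searching like this is fragile if the record layout changes, which is the weakness the JSON-based answers further down address.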
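Answer 1 (score: 1)
Every date follows a stable pattern such as \d\d (Jan|Feb|...) 20\d\d, so you can extract the dates with a regular expression (the Pattern class in Java). You can then use a HashMap to increment the count for each date, with the date found as the key. Sorry there is no code, but I hope this helps :)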
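Since the answer gives no code, here is a rough sketch of what it describes, assuming the whole file has already been read into a string. The class and method names are made up for illustration. Note that this counts every date-like substring, so a record such as the sample one contributes its interaction, twitter and user "created_at" dates alike.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexDateCounter {
    // Two-digit day, English month abbreviation, year 20xx, e.g. "10 Jul 2011".
    private static final Pattern DATE = Pattern.compile(
        "\\b\\d{2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) 20\\d{2}\\b");

    // Returns a map from each matched date string to the number of times it occurs.
    static Map<String, Integer> countDates(String text) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "\"created_at\":\"Sun, 10 Jul 2011 08:22:16 +0100\" ... "
                    + "\"created_at\":\"Sun, 10 Jul 2011 08:22:16 +0100\" ... "
                    + "\"created_at\":\"Mon, 05 Jan 2009 14:01:11 +0000\"";
        countDates(text).forEach((date, n) -> System.out.println(date + " -> " + n));
    }
}
```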
Answer 2 (score: 0)
I think this is a JSON string;
you should parse it rather than pattern-match it.
See this example HERE
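Answer 3 (score: 0)
Copy the required string into test.txt and put the file on the C: drive. This is working code; I used the Pattern and Matcher classes.
In the pattern I specified the date format you asked about; you can see the pattern here:
"(Sun|Mon|Tue|Wed|Thu|Fri|Sat)[,] \d\d (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d\d\d\d"
Check the code: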
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    public static void main(String[] args) throws Exception {
        // Read the whole file into one string; try-with-resources closes the reader.
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader("c:\\test.txt"))) {
            int c;
            while ((c = br.read()) != -1) {
                sb.append((char) c);
            }
        }
        String s = sb.toString();
        System.out.println(s);

        // Day name, comma, two-digit day, month abbreviation, four-digit year.
        Pattern p = Pattern.compile(
            "(Sun|Mon|Tue|Wed|Thu|Fri|Sat)[,] \\d\\d (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \\d\\d\\d\\d");
        Matcher m = p.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
            System.out.println("Match number " + count);
            System.out.println(m.group()); // the matched date text
        }
    }
}
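Answer 4 (score: 0)
Your input string is in JSON format, so I suggest using a JSON parser, which makes parsing much easier and, more importantly, more robust! It may take a few minutes to get into JSON parsing, but it is worth it.
After that, parse the "created_at" field. Create a map with the date as the key and the count as the value, and write something like this: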
import java.util.HashMap;
import java.util.Map;

class DateCount {
    public static void main(String[] args) {
        int estimatedSize = 500; // initial capacity, to avoid some HashMap resizing
        Map<String, Integer> myMap = new HashMap<>(estimatedSize);
        String[] dates = {}; // here comes your parsed data, draw it into the loop later
        for (String nextDate : dates) {
            Integer oldCount = myMap.get(nextDate);
            if (oldCount == null) { // not in yet
                myMap.put(nextDate, 1);
            } else { // already in
                myMap.put(nextDate, oldCount + 1);
            }
        }
        System.out.println(myMap); // date -> number of occurrences
    }
}
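One thing to watch: the raw "created_at" values include the time of day, so using them directly as map keys counts per timestamp rather than per date. A possible way around this, sketched below, is to parse each value and key the map by calendar day. This assumes the values have already been pulled out by whatever JSON parser you choose and that they all follow the format of the sample ("Sun, 10 Jul 2011 08:22:16 +0100"), which the JDK's RFC_1123_DATE_TIME formatter accepts. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CreatedAtCounter {
    // Counts occurrences per calendar day, taking the day at the offset written in each value.
    static Map<LocalDate, Integer> countByDay(List<String> createdAtValues) {
        Map<LocalDate, Integer> counts = new TreeMap<>();
        for (String value : createdAtValues) {
            LocalDate day = OffsetDateTime
                .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toLocalDate();
            counts.merge(day, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
            "Sun, 10 Jul 2011 08:22:16 +0100",
            "Sun, 10 Jul 2011 21:05:00 +0100",
            "Mon, 11 Jul 2011 10:15:00 +0100");
        System.out.println(countByDay(values));
        // prints {2011-07-10=2, 2011-07-11=1}
    }
}
```

A malformed value makes OffsetDateTime.parse throw a DateTimeParseException, so real input may need a try/catch around the parse. The TreeMap is only there to print the days in chronological order; a HashMap works just as well for counting.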