I am trying to solve this problem:
Get document on some condition in elastic search java API
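My approach is to first find the positions of all month names in the string, then extract the word that follows each one, which should be a 4-digit or 2-digit year, and use those years to compute the difference.
To get the positions of the months, I am using this code: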
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Three-letter month abbreviations, padded with spaces so that only whole words match.
String[] threeMonthArray = new String[]{" Jan ", " Feb ", " Mar ", " Apr ", " May ", " Jun ", " Jul ", " Aug ", " Sep ", " Oct ", " Nov ", " Dec "};
String[] completeMonthArray = new String[]{"January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"};
List<Integer> indexArray = new ArrayList<>();
// Lower-case the text once instead of on every search.
String lowerContent = parsedContent.toLowerCase();
for (String month : threeMonthArray) {
    String needle = month.toLowerCase();
    int index = lowerContent.indexOf(needle);
    // Collect every occurrence of this month, not only the first.
    while (index >= 0) {
        System.out.println(month + " : " + index + "------");
        indexArray.add(index);
        index = lowerContent.indexOf(needle, index + 1);
    }
}
Collections.sort(indexArray);
System.out.println(indexArray);
It prints this output:
[2873, 2884, 3086, 3098, 4303, 4315, 6251, 6262, 8130, 8142, 15700, 15711]
The positions are correct. My question is how to get the next word after each month, which must be a number.
Jun 2010 to Sep 2011 First Document
Jun 2009 to Aug 2011 Second Document
Nov 2011 – Sep 2012 Third Document
Nov 2012- Sep 2013 Fourth Document
Answer 0 (score: 1)
You can use a regular expression to find the next number, starting from the position of the last month found:
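import java.util.regex.Matcher;
import java.util.regex.Pattern;
// \d+ matches any run of digits. find(int) resets the matcher and searches
// from the given position, here the index of a month found earlier.
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(parsedContent);
if (m.find(index)) {
    String year = m.group();
}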
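Note that `\d+` picks up the next number anywhere after the month, even if it is many words away. As a minimal sketch of one way to tighten this, the example below anchors the match at a month position and accepts only a 2-digit or 4-digit number that directly follows the month word. The class name `MonthYearExtractor`, the method `yearAfterMonth`, and the pattern itself are illustrative assumptions, not part of the original answer.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MonthYearExtractor {

    // Optional leading whitespace, one word (the month), an optional dot,
    // whitespace, then exactly 2 or 4 digits not followed by another digit.
    private static final Pattern MONTH_THEN_YEAR =
            Pattern.compile("\\s*[A-Za-z]+\\.?\\s+(\\d{4}|\\d{2})(?!\\d)");

    /**
     * Returns the year that directly follows the month starting at monthIndex,
     * or an empty OptionalInt if the next word is not a 2- or 4-digit number.
     * monthIndex must lie within the text, otherwise region() throws.
     */
    static OptionalInt yearAfterMonth(String text, int monthIndex) {
        Matcher m = MONTH_THEN_YEAR.matcher(text);
        // Restrict matching to the part of the text that starts at the month.
        m.region(monthIndex, text.length());
        // lookingAt() requires the match to begin exactly at the region start.
        if (m.lookingAt()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        String text = "Jun 2010 to Sep 2011";
        for (String month : new String[]{"Jun", "Sep"}) {
            int index = text.indexOf(month);
            System.out.println(month + " -> " + yearAfterMonth(text, index));
        }
    }
}
```

Because the pattern tolerates leading whitespace, the indices collected in the question, which point at the space before each month, can be passed in unchanged. A 2-digit result still has to be mapped to a full year before computing a difference; how to do that depends on the documents.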