In Java, I need to extract the text that follows the date in a URL path.
Input 1:
/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/
Output: testcustomer
Only /raw/ stays fixed. The date will change, and testcustomer will change.
Input 2:
/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/
Output: newcustomer
String url = "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
String customer = getCustomer(url);
public String getCustomer (String _url){
String source = "default";
String regex = basePath + "/raw/\\d{4}-\\d{2}-\\d{2}/usr*";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(_url);
if (m.find()) {
source = m.group(1);
} else {
logger.error("Cant get customer with regex " + regex);
}
return source;
}
It is returning 'default' :(
Answer 0 (score: 2)
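Your regex /raw/\\d{4}-\\d{2}-\\d{2}/usr* is missing the part that captures the value you want: it expects /usr immediately after the date, so it never matches these URLs, and it has no capture group for m.group(1) to return. You need a regex that finds the date and captures the segment that follows it: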
/\w*/raw/[0-9-]+/(\w+)/.*
or (?<=raw\/\d{4}-\d{2}-\d{2}\/)(\w+) would also work.
Pattern p = Pattern.compile("/\\w*/raw/[0-9-]+/(\\w+)/.*");
Matcher m = p.matcher(str);
if (m.find()) {
String value = m.group(1);
System.out.println(value);
}
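The lookbehind variant mentioned above is not shown in code, so here is a minimal sketch of it. The class name LookbehindDemo and the method extractCustomer are made up for illustration. Because the lookbehind only asserts that raw/<date>/ precedes the match, the whole match is already the customer name and no capture group is needed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the original post.
public class LookbehindDemo {
    // The lookbehind checks for "raw/<yyyy-MM-dd>/" without consuming it,
    // so group 0 (the whole match) is the customer segment.
    private static final Pattern CUSTOMER =
            Pattern.compile("(?<=raw/\\d{4}-\\d{2}-\\d{2}/)\\w+");

    static String extractCustomer(String url) {
        Matcher m = CUSTOMER.matcher(url);
        // Return null when the URL does not contain raw/<date>/<name>.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(extractCustomer("/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/"));
        System.out.println(extractCustomer("/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/"));
    }
}
```

Note that \w+ only matches letters, digits, and underscores; a customer name containing a hyphen or dot would be cut short.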
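Or, if it is always the fourth path segment, use split(). The index is 4 because the leading slash produces an empty first element: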
String value = str.split("/")[4];
System.out.println(value);
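A fixed index breaks as soon as the number of segments before raw changes. As a hedged alternative, the sketch below locates the raw segment and takes the segment two positions after it (raw, then the date, then the customer). SplitDemo and customerAfterRaw are hypothetical names, and the sketch assumes the customer always directly follows the date.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the original answer.
public class SplitDemo {
    static String customerAfterRaw(String url) {
        List<String> parts = Arrays.asList(url.split("/"));
        int rawIndex = parts.indexOf("raw");
        // Expected layout: raw, then the date, then the customer.
        if (rawIndex < 0 || rawIndex + 2 >= parts.size()) {
            return null; // no "raw" segment, or the path ends too early
        }
        return parts.get(rawIndex + 2);
    }

    public static void main(String[] args) {
        System.out.println(customerAfterRaw("/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/"));
        System.out.println(customerAfterRaw("/a/b/raw/2018-09-01/newcustomer/usr/"));
    }
}
```

This version does not validate that the segment after raw is actually a date; the regex approaches do.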
Answer 1 (score: 2)
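Here we could use raw followed by the date as the left boundary, then collect the desired output in a capturing group, add a slash, and consume the rest of the string, with an expression similar to: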
.+raw\/[0-9]{4}-[0-9]{2}-[0-9]{2}\/(.+?)\/.+
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = ".+raw\\/[0-9]{4}-[0-9]{2}-[0-9]{2}\\/(.+?)\\/.+";
final String string = "/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/\n"
+ "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
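To connect this back to the question, the following is a minimal sketch of the original getCustomer method rewritten with a capturing group. It is one possible adaptation, not the asker's final code: the class name CustomerExtractor is made up, basePath is dropped and the logger is replaced by System.err because neither is defined in the post, and [^/]+ is swapped in for (.+?) so the group cannot run past a slash.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class around the method from the question.
public class CustomerExtractor {
    // Group 1 captures the path segment right after /raw/<yyyy-MM-dd>/.
    private static final Pattern CUSTOMER =
            Pattern.compile("/raw/\\d{4}-\\d{2}-\\d{2}/([^/]+)/");

    public static String getCustomer(String url) {
        String source = "default"; // same fallback value as in the question
        Matcher m = CUSTOMER.matcher(url);
        if (m.find()) {
            source = m.group(1);
        } else {
            // Stand-in for the logger call in the question.
            System.err.println("Can't get customer with regex " + CUSTOMER.pattern());
        }
        return source;
    }

    public static void main(String[] args) {
        System.out.println(getCustomer("/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/"));
    }
}
```

find() is used rather than matches(), so the pattern does not need leading or trailing wildcards to cover the rest of the URL.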
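If this expression is not what you want, or you wish to modify it, you can explore it at regex101.com.
jex.im can be used to visualize regular expressions.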