How to extract bracketed data from a string

Time: 2016-04-12 21:05:24

Tags: java regex string matching

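I'm trying to extract the link with rel="next" from the string below. The problem is that the order of the four links can change depending on whether a "prev" or "next" link is present. So I can't reliably get the link by using a regex or by splitting the string into an array.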

Here is the string:

<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=0&per_page=100>; rel="first",<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=20&per_page=100>; rel="last",<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=1&per_page=100>; rel="next"

I need to get this string:

<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=1&per_page=100>; rel="next"

Here is a more readable version:

<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=0&per_page=100>; rel="first",
<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=20&per_page=100>; rel="last",
<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=1&per_page=100>; rel="next"

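Ultimately I only want to extract the link for the API request. I tried splitting the string on `,`, but the URL itself may contain `,`, so that isn't reliable either. Thanks!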

2 answers:

Answer 0: (score: 1)

import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

String myString = "<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=0&per_page=100>; rel=\"first\",<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=20&per_page=100>; rel=\"last\",<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=1&per_page=100>; rel=\"next\"";
try {
    // Capture everything after the literal "last", up to the end of the string.
    // Note: this only yields the rel="next" entry when that entry is the final
    // one and comes directly after the rel="last" entry.
    Pattern regex = Pattern.compile("\"last\",(.*?)$");
    Matcher regexMatcher = regex.matcher(myString);
    if (regexMatcher.find()) {
        String next = regexMatcher.group(1);
        System.out.println(next);
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}

//<http://v4-api.prod.emailanalyst.com/v4/competitive/search?Authorization={API_KEY}&mobileReady=true&qd=between:20150101000000,20150101060000&onlyCommercial=true&hasCreative=true&page=1&per_page=100>; rel="next"

Regex explanation:

"last",(.*?)$

Options: Case sensitive; Exact spacing; Dot doesn’t match line breaks; ^$ don’t match at line breaks; Greedy quantifiers

Match the character string “"last",” literally (case sensitive) «"last",»
Match the regex below and capture its match into backreference number 1 «(.*?)»
   Match any single character that is NOT a line break character (line feed) «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of the string, or before the line break at the end of the string, if any (line feed) «$»

Sample: http://ideone.com/7mITYJ
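The pattern in this answer depends on the rel="next" entry coming right after the rel="last" entry, which the question says is not guaranteed. As a hedged alternative (not the answerer's method), the sketch below matches the `<...>` part that is directly followed by `; rel="next"`, wherever it sits in the string. The class name `NextLinkExtractor`, the method `nextLink`, and the shortened example.com URLs are assumptions made up for illustration; it also assumes URLs never contain `<` or `>`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NextLinkExtractor {
    // Matches "<URL>; rel="next"" and captures the URL (group 1).
    // [^<>]* cannot cross a '>' so the match never spans two entries;
    // commas inside the URL are fine because they are not excluded.
    private static final Pattern NEXT_LINK =
            Pattern.compile("<([^<>]*)>\\s*;\\s*rel=\"next\"");

    // Hypothetical helper: returns the URL of the rel="next" entry, if any.
    public static Optional<String> nextLink(String header) {
        Matcher m = NEXT_LINK.matcher(header);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the string in the question, with "next" first.
        String header = "<http://example.com/search?qd=between:1,2&page=1>; rel=\"next\","
                + "<http://example.com/search?qd=between:1,2&page=0>; rel=\"first\","
                + "<http://example.com/search?qd=between:1,2&page=20>; rel=\"last\"";
        System.out.println(nextLink(header).orElse("no next link"));
    }
}
```

Returning an `Optional` makes the "no next page" case explicit, which should be what happens on the last page of results.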

Answer 1: (score: 0)

Assuming the elements always start with "<http:", you can split with a regex that uses a positive lookahead:

String[] elements = str.split(",(?=<http:)");
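A minimal sketch of how the split result might then be used to pick out the rel="next" element and its URL. The class name `SplitLinkHeader`, the method `nextUrl`, and the shortened example.com URLs are hypothetical; the sketch assumes every element is well formed as `<URL>; rel="..."` and that no whitespace follows the separating comma.

```java
import java.util.Arrays;
import java.util.Optional;

public class SplitLinkHeader {
    // Hypothetical helper: returns the URL of the element tagged rel="next".
    public static Optional<String> nextUrl(String str) {
        // Split only on commas immediately followed by "<http:", so commas
        // inside a URL (e.g. "between:1,2") do not split an element.
        String[] elements = str.split(",(?=<http:)");
        return Arrays.stream(elements)
                .map(String::trim)
                .filter(e -> e.endsWith("rel=\"next\""))
                // Keep only the text between '<' and '>'.
                .map(e -> e.substring(e.indexOf('<') + 1, e.indexOf('>')))
                .findFirst();
    }

    public static void main(String[] args) {
        String str = "<http://example.com/search?qd=between:1,2&page=0>; rel=\"first\","
                + "<http://example.com/search?qd=between:1,2&page=1>; rel=\"next\","
                + "<http://example.com/search?qd=between:1,2&page=20>; rel=\"last\"";
        System.out.println(nextUrl(str).orElse("no next link"));
    }
}
```

Because the lookahead is hard-coded to "<http:", links that use a different scheme prefix or follow the comma with a space would not be split; the pattern would need adjusting for such input.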