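Create an array of regex matches

Time: 2011-05-16 16:28:28

Tags: java regex

In Java, I am trying to return all regex matches as an array, but it seems I can only check whether the pattern matches something or not (boolean).

How can I use a regex match to form an array of all the strings in a given string that match the regular expression?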

6 answers:

Answer 0: (score: 240)

(4castle's answer is better than the one below if you can assume Java >= 9.)

You need to create a matcher and use it to iteratively find matches.

 import java.util.ArrayList;
 import java.util.List;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 ...

 List<String> allMatches = new ArrayList<String>();
 Matcher m = Pattern.compile("your regular expression here")
     .matcher(yourStringHere);
 while (m.find()) {
   allMatches.add(m.group());
 }

After this, allMatches contains the matches, and you can use allMatches.toArray(new String[0]) to get an array if you really need one.
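
Wrapped into a small self-contained class, the loop and the array conversion might look like the sketch below. The class name AllMatchesExample, the method name findAll, and the sample regex and input are illustrative assumptions, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative only.
class AllMatchesExample {

    /** Collects every successive match of regex in input, in order. */
    static String[] findAll(String regex, CharSequence input) {
        List<String> allMatches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // find() advances to the next match each time it returns true.
        while (m.find()) {
            allMatches.add(m.group());
        }
        // Passing a zero-length array lets the list allocate a String[] of the right size.
        return allMatches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints [11, 22]
        System.out.println(Arrays.toString(findAll("\\d+", "aa11bb22")));
    }
}
```

If no match is found, findAll returns an empty array rather than null.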


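You can also use MatchResult to write helper functions that loop over matches, since Matcher.toMatchResult() returns a snapshot of the current group state.

For example, you can write a lazy iterator that lets you do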

for (MatchResult match : allMatches(pattern, input)) {
  // Use match, and maybe break without doing the work to find all possible matches.
}

with something like this:

public static Iterable<MatchResult> allMatches(
      final Pattern p, final CharSequence input) {
  return new Iterable<MatchResult>() {
    public Iterator<MatchResult> iterator() {
      return new Iterator<MatchResult>() {
        // Use a matcher internally.
        final Matcher matcher = p.matcher(input);
        // Keep a match around that supports any interleaving of hasNext/next calls.
        MatchResult pending;

        public boolean hasNext() {
          // Lazily fill pending, and avoid calling find() multiple times if the
          // clients call hasNext() repeatedly before sampling via next().
          if (pending == null && matcher.find()) {
            pending = matcher.toMatchResult();
          }
          return pending != null;
        }

        public MatchResult next() {
          // Fill pending if necessary (as when clients call next() without
          // checking hasNext()), throw if not possible.
          if (!hasNext()) { throw new NoSuchElementException(); }
          // Consume pending so next call to hasNext() does a find().
          MatchResult next = pending;
          pending = null;
          return next;
        }

        /** Required to satisfy the interface, but unsupported. */
        public void remove() { throw new UnsupportedOperationException(); }
      };
    }
  };
}

With this,

for (MatchResult match : allMatches(Pattern.compile("[abc]"), "abracadabra")) {
  System.out.println(match.group() + " at " + match.start());
}

yields

a at 0
b at 1
a at 3
c at 4
a at 5
a at 7
b at 8
a at 10

Answer 1: (score: 30)

In Java 9, you can now use Matcher#results() to get a Stream<MatchResult>, which you can use to obtain a list/array of matches.

import java.util.regex.Pattern;
import java.util.regex.MatchResult;
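
A minimal sketch of the stream-based approach, assuming Java 9 or later. The class name ResultsStreamExample, the method name findAll, and the sample regex and input are illustrative assumptions, not taken from the original answer.

```java
import java.util.Arrays;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative only.
class ResultsStreamExample {

    /** Returns the text of every match of regex in input (requires Java 9+). */
    static String[] findAll(String regex, CharSequence input) {
        return Pattern.compile(regex)
                .matcher(input)
                .results()                // Stream<MatchResult>, one element per match
                .map(MatchResult::group)  // the full text of each match
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Prints [11, 22]
        System.out.println(Arrays.toString(findAll("\\d+", "aa11bb22")));
    }
}
```

To get a List<String> instead of an array, the final toArray call could be replaced with collect(Collectors.toList()).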

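Answer 2: (score: 25)

Java makes regex too complicated, and it does not follow the Perl style. Take a look at MentaRegex to see how you can accomplish that in a single line of Java code: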

String[] matches = match("aa11bb22", "/(\\d+)/g" ); // => ["11", "22"]

Answer 3: (score: 8)

Here is a simple example:

Pattern pattern = Pattern.compile(regexPattern);
List<String> list = new ArrayList<String>();
Matcher m = pattern.matcher(input);
while (m.find()) {
    list.add(m.group());
}

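(If you have more capturing groups, you can refer to them by passing their index as an argument to the group method. If you need an array, use list.toArray(new String[0]); the no-argument list.toArray() returns an Object[].)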
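
As a sketch of the capturing-group variant, the loop can pass a group index to group(int). The class name CaptureGroupExample, the method name findGroup, and the key=value sample input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative only.
class CaptureGroupExample {

    /** Collects the given capturing group from every match of regex in input. */
    static String[] findGroup(String regex, CharSequence input, int group) {
        List<String> list = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // group(0) is the whole match; group(1), group(2), ... are the capturing groups.
            list.add(m.group(group));
        }
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String regex = "(\\w+)=(\\d+)";
        String input = "a=1, bb=22";
        // Prints [a, bb]
        System.out.println(Arrays.toString(findGroup(regex, input, 1)));
        // Prints [1, 22]
        System.out.println(Arrays.toString(findGroup(regex, input, 2)));
    }
}
```

Note that group(int) returns null for a group that did not take part in a match, so optional groups may need a null check.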

Answer 4: (score: 5)

From the official Regex Java Trails:

        Pattern pattern = 
        Pattern.compile(console.readLine("%nEnter your regex: "));

        Matcher matcher = 
        pattern.matcher(console.readLine("Enter input string to search: "));

        boolean found = false;
        while (matcher.find()) {
            console.format("I found the text \"%s\" starting at " +
               "index %d and ending at index %d.%n",
                matcher.group(), matcher.start(), matcher.end());
            found = true;
        }

Use find and insert the resulting group into your array/list/whatever.

Answer 5: (score: -1)

        Set<String> keyList = new HashSet<>();
        Pattern regex = Pattern.compile("#\\{(.*?)\\}");
        Matcher matcher = regex.matcher("Content goes here");
        while(matcher.find()) {
            keyList.add(matcher.group(1)); 
        }
        return keyList;