Convert regex Captures into HashMap in Rust?

Time: 2019-01-18 18:16:09

Tags: regex hashmap rust regex-group

I have a Regex with an unknown number of named groups with unknown names. I want to match a string to that regex, and get a HashMap<&str, &str> with the name of the groups as key and the captured strings as value.

How can I do this? Will I have to use regex.captures(str).iter() and then somehow map and filter and collect into a map? Or is there some shortcut?

2 answers:

Answer 0 (score: 3)

This is tricky because a regex can match a string more than once, and a capture group inside a repetition can itself match several times within a single overall match.

Maybe something like this:

use regex::Regex;
use std::collections::HashMap;

fn main() {
    let re = Regex::new(r"(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})").unwrap();
    let text = "2012-03-14";
    // Only the first match is inspected; unwrap() panics if there is none.
    let caps = re.captures(text).unwrap();
    let dict: HashMap<&str, &str> = re
        .capture_names()
        .flatten()
        .filter_map(|n| Some((n, caps.name(n)?.as_str())))
        .collect();
    println!("{:#?}", dict);
}

That outputs (the order of the entries may vary, because HashMap is unordered):

{
    "y": "2012",
    "d": "14",
    "m": "03"
}

The code is simple once you realize that the capture names are not available from the Captures value itself, but from the parent Regex. You have to do the following:

  1. Call capture_names(), which returns an iterator of Option<&str> (None for unnamed groups).
  2. flatten() the iterator, which drops the None entries and unwraps the &str values.
  3. filter_map() the capture names into (name, value) tuples of type (&str, &str). The filter is needed to skip groups that did not participate in the match (thanks to @Anders).
  4. collect()! This just works because HashMap<K, V> implements the trait FromIterator<(K, V)>, so an iterator of (&str, &str) collects into a HashMap<&str, &str>.
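
To see steps 2 to 4 in isolation, here is a minimal sketch that uses only the standard library. The helper name to_map and its lookup parameter are hypothetical and not part of the regex crate. With a real Regex, names would presumably come from re.capture_names() and the closure would be |n| caps.name(n).map(|m| m.as_str()).

```rust
use std::collections::HashMap;

// Hypothetical helper (not part of the regex crate): builds a name -> value
// map from optional group names and a lookup closure.
fn to_map<'a>(
    names: impl Iterator<Item = Option<&'a str>>,
    lookup: impl Fn(&str) -> Option<&'a str>,
) -> HashMap<&'a str, &'a str> {
    names
        .flatten() // drop unnamed groups (None) and unwrap the names
        .filter_map(|n| Some((n, lookup(n)?))) // skip groups with no value
        .collect() // works because HashMap implements FromIterator<(K, V)>
}

fn main() {
    // Stand-in for capture_names(): group 0 (the whole match) has no name.
    let names = vec![None, Some("y"), Some("m"), Some("d")];
    // Stand-in for caps.name(n): pretend group "d" did not participate.
    let lookup = |n: &str| match n {
        "y" => Some("2012"),
        "m" => Some("03"),
        _ => None,
    };
    let dict = to_map(names.into_iter(), lookup);
    assert_eq!(dict.len(), 2);
    assert_eq!(dict.get("y"), Some(&"2012"));
    assert_eq!(dict.get("d"), None);
}
```

Keeping the lookup as a closure means the same helper could be reused for each item of captures_iter() when the regex matches more than once, giving one map per match.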

Answer 1 (score: 2)

If there are multiple captures, you can collect them into a list like this:

library(stringr)

strsplit(
  strings, 
  paste0("^[^_]*(?:_[^_]*){", str_count(strings, '_') %/% 2, "}\\K_"), 
  perl = TRUE)

# [[1]]
# [1] "aa_bb_cc" "dd_ee_ff"
# 
# [[2]]
# [1] "cc_hh" "ff_zz"
# 
# [[3]]
# [1] "bb" "dd"