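How can I split a string into runs of identical characters?

Time: 2018-03-04 19:34:31

Tags: rust

I want to take a string that may contain repeated characters and split it into chunks, one for each run of consecutive identical characters.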

So, for example,

aaaabbbabbbaaaacccbbbbbbbbaaa

would become

["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]

2 answers:

Answer 0: (score: 4)

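A concise approach is to use Itertools::group_by on an iterator over the string's chars (itertools 0.13 and later deprecate group_by in favor of the renamed chunk_by):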

extern crate itertools;

use itertools::Itertools;

fn main() {
    let input = "aaaabbbabbbaaaacccbbbbbbbbaaa";

    let output: Vec<String> = input
        .chars()
        .group_by(|&x| x)
        .into_iter()
        .map(|(_, r)| r.collect())
        .collect();

    assert_eq!(
        output,
        ["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}
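Note that group_by only merges consecutive elements with equal keys, which is why "a" and "bbb" show up more than once in the output. As a rough illustration of that behavior, here is a minimal std-only sketch; the function name group_runs is hypothetical and the loop is an approximation of what the adapter does, not how itertools implements it.

```rust
/// Groups consecutive equal characters into owned strings.
/// (`group_runs` is a hypothetical name used only for this sketch.)
fn group_runs(input: &str) -> Vec<String> {
    let mut groups: Vec<String> = Vec::new();
    let mut prev: Option<char> = None;

    for c in input.chars() {
        if prev == Some(c) {
            // Same character as the previous one: extend the current run.
            groups
                .last_mut()
                .expect("a run exists once `prev` is set")
                .push(c);
        } else {
            // A different character (or the very first one) starts a new run.
            groups.push(c.to_string());
        }
        prev = Some(c);
    }

    groups
}

fn main() {
    assert_eq!(
        group_runs("aaaabbbabbbaaaacccbbbbbbbbaaa"),
        ["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}
```

Like the itertools version, this sketch allocates one String per run.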

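However, this approach allocates a new String for each group of characters. A more efficient solution is to return slices into the original string.

A (hacky) modification of the previous solution gives the following: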

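use itertools::Itertools;

fn main() {
    let input = "aaaabbbabbbaaaacccbbbbbbbbaaa";

    // `start` always holds the part of `input` that has not been sliced off yet.
    let mut start = input;
    let output: Vec<&str> = input
        .chars()
        .group_by(|&x| x)
        .into_iter()
        .map(|(_, r)| {
            // Byte length of this run; split_at expects a byte offset.
            let len: usize = r.map(|c| c.len_utf8()).sum();
            let (a, b) = start.split_at(len);
            start = b;
            a
        })
        .collect();

    assert_eq!(output, ["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]);
}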
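The slicing version sums len_utf8 instead of counting characters because split_at takes a byte offset, and the two differ as soon as the string contains multi-byte characters. A small std-only sketch of that difference follows; the helper run_byte_len is hypothetical and exists only for this illustration.

```rust
/// Byte length of the leading run of identical characters in `s`.
/// (`run_byte_len` is a hypothetical helper, not part of the answer.)
fn run_byte_len(s: &str) -> usize {
    let mut chars = s.chars();
    match chars.next() {
        None => 0,
        Some(first) => {
            first.len_utf8()
                + chars
                    .take_while(|&c| c == first)
                    .map(|c| c.len_utf8())
                    .sum::<usize>()
        }
    }
}

fn main() {
    let s = "ééab";
    // The leading run has 2 characters but occupies 4 bytes in UTF-8.
    assert_eq!(s.chars().take_while(|&c| c == 'é').count(), 2);
    assert_eq!(run_byte_len(s), 4);

    // split_at takes a byte offset, so the byte length gives the right cut.
    let (head, tail) = s.split_at(run_byte_len(s));
    assert_eq!((head, tail), ("éé", "ab"));

    // A character count used as a byte offset can fall inside a character,
    // which is where split_at would panic.
    assert!(!"éab".is_char_boundary(1));
}
```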

Answer 1: (score: 2)

If you consider an external crate overkill, you can do this:

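fn group_chars(mut input: &str) -> Vec<&str> {
    // Byte offset of the first character that differs from the leading one,
    // i.e. the byte length of the leading run; None if the input is empty.
    fn first_different(mut chars: std::str::Chars) -> Option<usize> {
        chars.next().map(|f| {
            chars
                .take_while(|&c| c == f)
                .fold(f.len_utf8(), |len, c| len + c.len_utf8())
        })
    }

    let mut output = Vec::new();

    // Peel one run off the front of `input` until nothing is left.
    while let Some(different) = first_different(input.chars()) {
        let (before, after) = input.split_at(different);
        input = after;
        output.push(before);
    }

    output
}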

fn main() {
    assert_eq!(
        group_chars("aaaabbbébbbaaaacccbbbbbbbbaaa"),
        ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}

Or you could write an iterator:

pub struct CharGroups<'a> {
    input: &'a str,
}

impl<'a> CharGroups<'a> {
    pub fn new(input: &'a str) -> CharGroups<'a> {
        CharGroups { input }
    }
}

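impl<'a> Iterator for CharGroups<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        // An empty remainder yields None, which ends the iteration.
        self.input.chars().next().map(|f| {
            // Byte index of the first character that differs from `f`,
            // or the end of the string if the whole remainder is one run.
            let i = self.input.find(|c| c != f).unwrap_or(self.input.len());
            let (before, after) = self.input.split_at(i);
            self.input = after;
            before
        })
    }
}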

fn main() {
    assert_eq!(
        CharGroups::new("aaaabbbébbbaaaacccbbbbbbbbaaa").collect::<Vec<_>>(),
        ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}
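If a dedicated struct feels heavy, the same lazy behavior can probably be had from std::iter::from_fn. The following is only a sketch under that assumption: char_groups is a hypothetical name, and the run-length pairs in main are an illustrative use rather than something the question asks for.

```rust
use std::iter;

/// Lazily yields each run of identical characters as a slice of `input`.
/// (`char_groups` is a hypothetical name, not part of the answer.)
fn char_groups(mut input: &str) -> impl Iterator<Item = &str> {
    iter::from_fn(move || {
        // Copy of the reference, so the slices keep the caller's lifetime.
        let rest = input;
        // `?` returns None once the input is exhausted.
        let first = rest.chars().next()?;
        let end = rest.find(|c: char| c != first).unwrap_or(rest.len());
        let (run, tail) = rest.split_at(end);
        input = tail;
        Some(run)
    })
}

fn main() {
    // (character, run length) pairs, built without allocating any strings.
    let counts: Vec<(char, usize)> = char_groups("aaabccdddd")
        .filter_map(|run| Some((run.chars().next()?, run.chars().count())))
        .collect();
    assert_eq!(counts, [('a', 3), ('b', 1), ('c', 2), ('d', 4)]);

    // Laziness: runs after the second one are never requested.
    let first_two: Vec<&str> = char_groups("aaaabbbébbb").take(2).collect();
    assert_eq!(first_two, ["aaaa", "bbb"]);
}
```

Because each item borrows from the original string, the iterator composes with adapters such as take or map without copying any character data.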