I want to take a string that may contain repeated characters and split it into runs of consecutive identical characters.
So, for example,
aaaabbbabbbaaaacccbbbbbbbbaaa
would become
aaaa, bbb, a, bbb, aaaa, ccc, bbbbbbbb, aaa
Answer 0: (score: 4)
A concise way is to use Itertools::group_by on the iterator of chars:
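    extern crate itertools;

    use itertools::Itertools;

    fn main() {
        let input = "aaaabbbabbbaaaacccbbbbbbbbaaa";

        // Group consecutive equal chars, then collect each group into its own String.
        // Note: itertools 0.13 and later call this adaptor `chunk_by`; `group_by` is deprecated there.
        let output: Vec<String> = input
            .chars()
            .group_by(|&x| x)
            .into_iter()
            .map(|(_, r)| r.collect())
            .collect();

        assert_eq!(
            output,
            ["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
        );
    }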
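However, this allocates a new String for every group of characters. A more efficient solution returns slices into the original string.
A (hacky) modification of the previous solution produces the following: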
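    extern crate itertools;
    use itertools::Itertools;
    fn main() {
        let input = "aaaabbbabbbaaaacccbbbbbbbbaaa";
        // `start` is the part of the input that has not been sliced off yet.
        let mut start = input;
        let output: Vec<&str> = input
            .chars()
            .group_by(|&x| x)
            .into_iter()
            .map(|(_, r)| {
                // Byte length of this group; `split_at` takes a byte offset.
                let len: usize = r.map(|c| c.len_utf8()).sum();
                let (a, b) = start.split_at(len);
                start = b;
                a
            })
            .collect();
        assert_eq!(
            output,
            ["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
        );
    }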
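The slicing version sums `len_utf8()` rather than counting characters because `str::split_at` takes a byte offset, and it panics if that offset falls inside a multi-byte character. A minimal, std-only sketch of that single step follows; `split_first_run` is a hypothetical helper name introduced here for illustration, not something from the answer.

```rust
/// Splits off the leading run of identical characters and returns
/// `(run, rest)`. Hypothetical helper, shown only to illustrate byte offsets.
fn split_first_run(s: &str) -> (&str, &str) {
    let first = match s.chars().next() {
        Some(c) => c,
        // Empty input: there is no run to split off.
        None => return ("", s),
    };
    // Sum the UTF-8 byte lengths of the run, because `split_at` expects
    // a byte offset, not a character count.
    let byte_len: usize = s
        .chars()
        .take_while(|&c| c == first)
        .map(char::len_utf8)
        .sum();
    s.split_at(byte_len)
}
```

For ASCII input the byte length and the character count coincide, so the difference only shows up with characters such as `é`, which occupies two bytes in UTF-8.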
Answer 1: (score: 2)
If you think an external crate is overkill, you can do this:
    fn group_chars(mut input: &str) -> Vec<&str> {
        // Byte offset just past the leading run of identical chars,
        // or None if the input is empty.
        fn first_different(mut chars: std::str::Chars) -> Option<usize> {
            chars.next().map(|f| {
                chars
                    .take_while(|&c| c == f)
                    .fold(f.len_utf8(), |len, c| len + c.len_utf8())
            })
        }

        let mut output = Vec::new();
        while let Some(different) = first_different(input.chars()) {
            let (before, after) = input.split_at(different);
            input = after;
            output.push(before);
        }
        output
    }

    fn main() {
        assert_eq!(
            group_chars("aaaabbbébbbaaaacccbbbbbbbbaaa"),
            ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
        );
    }
Or you can make it an iterator:
    pub struct CharGroups<'a> {
        input: &'a str,
    }

    impl<'a> CharGroups<'a> {
        pub fn new(input: &'a str) -> CharGroups<'a> {
            CharGroups { input }
        }
    }

    impl<'a> Iterator for CharGroups<'a> {
        type Item = &'a str;

        fn next(&mut self) -> Option<&'a str> {
            self.input.chars().next().map(|f| {
                // Byte index of the first char that differs from `f`,
                // or the end of the input if the whole remainder is one run.
                let i = self.input.find(|c: char| c != f).unwrap_or(self.input.len());
                let (before, after) = self.input.split_at(i);
                self.input = after;
                before
            })
        }
    }

    fn main() {
        assert_eq!(
            CharGroups::new("aaaabbbébbbaaaacccbbbbbbbbaaa").collect::<Vec<_>>(),
            ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
        );
    }
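Because the groups come out of an ordinary iterator of `&str` slices, they can be fed into further processing without collecting them first. As a rough sketch, the function below turns any such iterator into (character, count) pairs; `run_lengths` is a hypothetical name introduced here, and it assumes each slice is a run of a single repeated character, as the groups above are.

```rust
/// Converts an iterator of single-character runs into (char, count) pairs.
/// Hypothetical helper; assumes every slice repeats one character.
fn run_lengths<'a, I>(groups: I) -> Vec<(char, usize)>
where
    I: Iterator<Item = &'a str>,
{
    groups
        .filter_map(|g| {
            // Take the run's character; empty slices are skipped.
            let c = g.chars().next()?;
            // Count chars, not bytes, so multi-byte characters count once each.
            Some((c, g.chars().count()))
        })
        .collect()
}
```

With the iterator from the answer, `run_lengths(CharGroups::new("aaabcc"))` would yield `[('a', 3), ('b', 1), ('c', 2)]`.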