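I'm implementing a scanner in Rust. I have a scan method on a Scanner struct that takes a string slice as the source code, breaks that string into a Vec<&str> of UTF-8 characters (using the unicode_segmentation crate), and then delegates each character to a scan_token method, which determines its lexical token and returns it.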
extern crate unicode_segmentation;
use unicode_segmentation::UnicodeSegmentation;

struct Scanner {
    start: usize,
    current: usize,
}

#[derive(Debug)]
struct Token<'src> {
    lexeme: &'src [&'src str],
}

impl Scanner {
    pub fn scan<'src>(&mut self, source: &'src str) -> Vec<Token<'src>> {
        let mut offset = 0;
        let mut tokens = Vec::new();
        // break up the code into UTF8 graphemes
        let chars: Vec<&str> = source.graphemes(true).collect();
        while let Some(_) = chars.get(offset) {
            // determine which token this grapheme represents
            let token = self.scan_token(&chars);
            // push it to the tokens array
            tokens.push(token);
            offset += 1;
        }
        tokens
    }

    pub fn scan_token<'src>(&mut self, chars: &'src [&'src str]) -> Token<'src> {
        // get this lexeme as some slice of the slice of chars
        let lexeme = &chars[self.start..self.current];
        let token = Token { lexeme };
        token
    }
}

fn main() {
    let mut scanner = Scanner {
        start: 0,
        current: 0,
    };
    let tokens = scanner.scan("abcd");
    println!("{:?}", tokens);
}
The error I receive is:
error[E0597]: `chars` does not live long enough
  --> src/main.rs:22:42
   |
22 |             let token = self.scan_token(&chars);
   |                                          ^^^^^ borrowed value does not live long enough
...
28 |     }
   |     - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'src as defined on the method body at 15:17...
  --> src/main.rs:15:17
   |
15 |     pub fn scan<'src>(&mut self, source: &'src str) -> Vec<Token<'src>> {
   |                 ^^^^
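I think I understand the logic behind why this doesn't work: the error makes it clear that chars needs to live as long as the lifetime 'src, because tokens contains slice references into the data inside chars.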
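What I don't understand is this: since chars is just a slice of references into an object that does have a lifetime of 'src (i.e. source), why can't tokens reference this data after chars has been dropped? I'm fairly new to low-level programming, and I suspect my intuition about references and lifetimes may be somewhat broken.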
Answer 0 (score: 1)
Your problem can be reduced to this:
pub fn scan<'a>(source: &'a str) -> Option<&'a str> {
    let chars: Vec<&str> = source.split("").collect();
    scan_token(&chars)
}

pub fn scan_token<'a>(chars: &'a [&'a str]) -> Option<&'a str> {
    chars.last().cloned()
}
error[E0597]: `chars` does not live long enough
 --> src/lib.rs:3:17
  |
3 |     scan_token(&chars)
  |                 ^^^^^ borrowed value does not live long enough
4 | }
  | - borrowed value only lives until here
  |
note: borrowed value must be valid for the lifetime 'a as defined on the function body at 1:13...
 --> src/lib.rs:1:13
  |
1 | pub fn scan<'a>(source: &'a str) -> Option<&'a str> {
  |             ^^
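The scan_token function requires that the reference to the slice and the references inside the slice have the same lifetime: &'a [&'a str]. Since the Vec lives for a shorter period of time, that shorter lifetime is what the unified lifetime must be. However, the vector does not live long enough for the value to be returned.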
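To make the distinction concrete, here is a minimal sketch of the same reduced example with the two lifetimes written out separately. The lifetime name 'v is an illustrative choice and does not appear in the original code; it marks the short borrow of the local vector, while 'a remains the lifetime of the data borrowed from source.

```rust
// Sketch only: 'v (hypothetical name) is the short borrow of the local Vec,
// 'a is the longer lifetime of the string data that comes from `source`.
pub fn scan<'a>(source: &'a str) -> Option<&'a str> {
    // `chars` is dropped at the end of this function, but every element
    // in it points into `source`, which outlives the function call.
    let chars: Vec<&'a str> = source.split("").collect();
    scan_token(&chars)
}

// The returned reference is tied to 'a only, so it may outlive the slice
// borrow 'v that was used to reach it.
pub fn scan_token<'v, 'a>(chars: &'v [&'a str]) -> Option<&'a str> {
    chars.last().copied()
}
```

Because the return type mentions only 'a, the compiler no longer needs the vector itself to live as long as the returned string slice.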
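Remove the unneeded lifetime:
pub fn scan_token<'a>(chars: &[&'a str]) -> Option<&'a str>
Applying these changes to your complete code, you will see that the core problem is repeated in the definition of Token:
struct Token<'src> {
    lexeme: &'src [&'src str],
}
This construction makes it impossible for your code to compile as-is: there is no vector of slices that lives as long as the slices. Your code simply cannot work in this form.
You could pass in a mutable reference to a Vec to use as storage, but this is quite unusual and has plenty of downsides that you will hit when you try to do anything bigger:
impl Scanner {
    pub fn scan<'src>(&mut self, source: &'src str, chars: &'src mut Vec<&'src str>) -> Vec<Token<'src>> {
        // ...
        chars.extend(source.graphemes(true));
        // ...
        while let Some(_) = chars.get(offset) {
            // ...
            let token = self.scan_token(chars);
            // ...
        }
        // ...
    }
    // ...
}

fn main() {
    // ...
    let mut chars = Vec::new();
    let tokens = scanner.scan("abcd", &mut chars);
    // ...
}
You probably just want Token to own its lexeme, for example as a Vec<&'src str>, rather than borrow a slice of a local vector.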
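As a rough illustration of that last suggestion, the sketch below lets Token own a Vec<&'src str>. It rests on several assumptions that are not in the original code: the standard library's char_indices stands in for graphemes(true) so the example needs no external crate, each character becomes its own token, and start and current are advanced inside the loop.

```rust
#[derive(Debug, PartialEq)]
pub struct Token<'src> {
    // The Vec is owned by the token; its elements still borrow from the source.
    pub lexeme: Vec<&'src str>,
}

pub struct Scanner {
    pub start: usize,
    pub current: usize,
}

impl Scanner {
    pub fn scan<'src>(&mut self, source: &'src str) -> Vec<Token<'src>> {
        // Assumption: split on `char` boundaries using std only, instead of
        // grapheme clusters from the unicode_segmentation crate.
        let chars: Vec<&'src str> = source
            .char_indices()
            .map(|(i, c)| &source[i..i + c.len_utf8()])
            .collect();
        let mut tokens = Vec::new();
        while self.current < chars.len() {
            // Assumption: every character is a single-character token.
            self.start = self.current;
            self.current += 1;
            tokens.push(self.scan_token(&chars));
        }
        tokens
    }

    pub fn scan_token<'src>(&self, chars: &[&'src str]) -> Token<'src> {
        // Copy the `&'src str` references into an owned Vec, so the token
        // no longer depends on how long `chars` itself lives.
        Token {
            lexeme: chars[self.start..self.current].to_vec(),
        }
    }
}
```

Creating a Scanner with start and current both set to 0 and calling scan("abcd") on it yields four tokens, each holding a one-element Vec. Copying the references costs one small allocation per token, which is the trade-off for not having to keep the intermediate vector alive.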