Question

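I have a file containing multiple instances of some complex data type (think of a trajectory of events). The API for reading this file is written in C, and I don't have much control over it. To expose it to Rust, I implemented the following interface: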

// A single event read from the file
#[derive(Debug)]
struct Event {
    a: u32,
    b: f32,
}

// A handle to the file used for I/O
struct EventFile;

impl EventFile {
    fn open() -> Result<EventFile, Error> {
        unimplemented!()
    }

    // Read the next step of the trajectory into `event`
    fn read(&self, event: &mut Event) -> Result<(), Error> {
        event.a = unimplemented!();
        event.b = unimplemented!();
        Ok(())
    }
}

To access the contents of the file, I can call the read function until it returns an Err, along these lines:

let event_file = EventFile::open().unwrap(); // open() returns a Result
let mut event = Event::new();

// Keep reading into the same `event` until read() returns an Err
while event_file.read(&mut event).is_ok() {
    println!("{:?}", event);
}

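Because the event is reused by every call to read, memory is not repeatedly allocated and deallocated, which hopefully improves performance (the event struct is much larger in the real implementation).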

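Now, it would be nice to be able to access this data through an iterator. However, as far as I understand, this means that I have to create a new instance of Event every time the iterator yields, because I cannot reuse the event inside the iterator. This would hurt performance: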

struct EventIterator {
    event_file: EventFile,
}
impl Iterator for EventIterator {
    type Item = Event;
    fn next(&mut self) -> Option<Event> {
        let mut event = Event::new(); // costly allocation
        let result = self.event_file.read(&mut event);
        match result {
            Ok(_) => Some(event),
            Err(_) => None,
        }
    }
}

let it = EventIterator { event_file };
it.map(|event| unimplemented!())

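Is there a way to somehow "recycle" or "reuse" the event in the iterator? Or does this concept simply not transfer to Rust, so that I have to choose between using iterators and getting better performance?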

Answer 1

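You can "recycle" items between iterations by wrapping the Item in a reference counter. The idea is that if the caller keeps the item around between iterations, the iterator allocates a new object and returns that one. If the caller drops the item before the next iteration begins, the item is recycled. This is ensured by std::rc::Rc::get_mut(), which only returns a mutable reference if there are no other Rc or Weak pointers to the same allocation, that is, if the iterator holds the only handle.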
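To make the behaviour of Rc::get_mut() concrete, here is a minimal sketch. The helper is_unique is a hypothetical name introduced only for this illustration; it is not part of the standard library.

```rust
use std::rc::Rc;

/// Hypothetical helper: true if `rc` is currently the only handle to its
/// allocation, i.e. if `Rc::get_mut` would hand out a mutable reference.
fn is_unique(rc: &mut Rc<String>) -> bool {
    Rc::get_mut(rc).is_some()
}

fn main() {
    let mut item = Rc::new(String::from("a"));
    // Only one handle exists, so in-place mutation is allowed.
    assert!(is_unique(&mut item));

    // Simulates a caller that keeps the yielded item around.
    let kept = Rc::clone(&item);
    assert_eq!(Rc::strong_count(&item), 2);
    assert!(!is_unique(&mut item));

    // Once the caller lets go of the item, it can be recycled again.
    drop(kept);
    assert!(is_unique(&mut item));
}
```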

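This has the drawback that your Iterator yields Rc<Foo> instead of Foo. The reference counting also adds code complexity and (maybe) some runtime cost, although that cost might be eliminated entirely if the compiler can prove it is unnecessary.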

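So you will need to measure whether this actually gives you a performance win. Allocating a new object on every iteration may seem expensive, but allocators are good at this...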

Something like this:

use std::rc::Rc;

#[derive(Default)]
struct FoobarIterator {
    item: Rc<String>,
}

impl Iterator for FoobarIterator {
    type Item = Rc<String>;

    fn next(&mut self) -> Option<Self::Item> {
        let item = match Rc::get_mut(&mut self.item) {
            Some(item) => {
                // This path is only taken if the caller
                // did not keep the item around
                // so we are the only reference-holder!
                println!("Item is re-used!");
                item
            },
            None => {
                // Let go of the item (the caller gets to keep it)
                // and create a new one
                println!("Creating new item!");
                self.item = Rc::new(String::new());
                Rc::get_mut(&mut self.item).unwrap()
            }
        };
        // Create the item, possibly reusing the same allocation...
        item.clear();
        item.push('a');
        Some(Rc::clone(&self.item))
    }
}

fn main() {
    // This will only print "Item is re-used"
    // because `item` is dropped before the next cycle begins
    for item in FoobarIterator::default().take(5) {
        println!("{}", item);
    }

    // After the first call, this allocates a new object on every call
    // because the Vec keeps every yielded item alive.
    let _: Vec<_> = FoobarIterator::default().take(5).collect();
}
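The same pattern can be applied to the Event type from the question. The following is only a sketch under assumptions: EventFile here is a hypothetical in-memory stand-in for the C-backed reader (its read takes &mut self because the stub tracks its own position), and the allocations counter and the count_allocations helper exist purely to make the recycling observable.

```rust
use std::rc::Rc;

#[derive(Debug, Default)]
struct Event {
    a: u32,
    b: f32,
}

// Hypothetical stand-in for the C-backed reader: produces `len` events.
struct EventFile {
    next: u32,
    len: u32,
}

impl EventFile {
    // Fills `event` in place; Err(()) signals the end of the data.
    fn read(&mut self, event: &mut Event) -> Result<(), ()> {
        if self.next >= self.len {
            return Err(());
        }
        event.a = self.next;
        event.b = self.next as f32 * 0.5;
        self.next += 1;
        Ok(())
    }
}

struct EventIterator {
    event_file: EventFile,
    event: Rc<Event>,
    allocations: usize, // fresh Events created after construction
}

impl EventIterator {
    fn new(len: u32) -> Self {
        EventIterator {
            event_file: EventFile { next: 0, len },
            event: Rc::new(Event::default()),
            allocations: 0,
        }
    }
}

impl Iterator for EventIterator {
    type Item = Rc<Event>;

    fn next(&mut self) -> Option<Rc<Event>> {
        // If the caller still holds the previous item, start a fresh one.
        if Rc::get_mut(&mut self.event).is_none() {
            self.event = Rc::new(Event::default());
            self.allocations += 1;
        }
        let event = Rc::get_mut(&mut self.event).expect("sole owner");
        self.event_file.read(event).ok()?;
        Some(Rc::clone(&self.event))
    }
}

// Drives the iterator and reports how many fresh Events were needed.
fn count_allocations(len: u32, keep: bool) -> usize {
    let mut it = EventIterator::new(len);
    let mut kept = Vec::new();
    for event in it.by_ref() {
        if keep {
            kept.push(event);
        }
    }
    it.allocations
}

fn main() {
    for event in EventIterator::new(3) {
        println!("{} {}", event.a, event.b);
    }
    println!("dropped each item: {}", count_allocations(5, false));
    println!("kept each item: {}", count_allocations(5, true));
}
```

When every item is dropped before the next call, the single Event created at construction is reused throughout; when the caller keeps the items, each call has to start a fresh Event.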

Answer 2

In this case, the compiler (or LLVM) will very likely apply return value optimization, so you don't need to optimize prematurely yourself.

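See this Godbolt example, in particular lines 43 to 47. My understanding of assembly is limited, but it seems that next() simply writes the Event value into memory that the caller passes in via a pointer (initially in rdi). In subsequent loop iterations, this memory location can be reused.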

Note that if you compile without the -O flag (e.g. in "debug" mode rather than "release" mode), you get much longer assembly output (which I have not analyzed in depth).
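A rough way to check this on your own machine is to time both access patterns in a release build. The sketch below is an assumption-laden stand-in, not the real reader: read_into, ByValue, sum_reused and sum_iterator are hypothetical names, and the 64-element payload only imitates a "much larger" event. Timings from such a micro-benchmark are indicative at best.

```rust
use std::hint::black_box;
use std::time::Instant;

struct Event {
    a: u32,
    payload: [f32; 64], // stand-in for a "much larger" real event
}

// Hypothetical reader: fills a caller-provided Event, like the C-style API.
fn read_into(i: u32, event: &mut Event) {
    event.a = i;
    event.payload = [i as f32; 64];
}

// By-value iterator: builds a fresh Event for every item.
struct ByValue {
    next: u32,
    len: u32,
}

impl Iterator for ByValue {
    type Item = Event;

    fn next(&mut self) -> Option<Event> {
        if self.next >= self.len {
            return None;
        }
        let mut event = Event { a: 0, payload: [0.0; 64] };
        read_into(self.next, &mut event);
        self.next += 1;
        Some(event)
    }
}

// Reads every event into one reused buffer and sums a few fields.
fn sum_reused(len: u32) -> u64 {
    let mut event = Event { a: 0, payload: [0.0; 64] };
    let mut sum = 0u64;
    for i in 0..len {
        read_into(i, &mut event);
        let e = black_box(&event);
        sum += u64::from(e.a) + e.payload[63] as u64;
    }
    sum
}

// Same computation, but through the by-value iterator.
fn sum_iterator(len: u32) -> u64 {
    ByValue { next: 0, len }
        .map(|event| {
            let e = black_box(&event);
            u64::from(e.a) + e.payload[63] as u64
        })
        .sum()
}

fn main() {
    let len = 1_000_000;

    let start = Instant::now();
    let reused = sum_reused(len);
    println!("reused buffer:     {:?}", start.elapsed());

    let start = Instant::now();
    let by_value = sum_iterator(len);
    println!("by-value iterator: {:?}", start.elapsed());

    // Both variants must compute the same result.
    assert_eq!(reused, by_value);
}
```

Build with optimizations (for example cargo run --release or rustc -O); as noted above, a debug build produces very different code.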

Recycling items in an iterator to improve performance
