Reading a text file

Time: 2010-01-18 02:36:13

Tags: c++ file-io input

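I'm trying to figure out the best way to read large text files (at least 5 MB) in C++, with speed and efficiency in mind. Are there any preferred classes or functions, and why?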

By the way, I'm running exclusively in a UNIX environment.

3 answers:

Answer 0 (score: 0)

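The stream classes (ifstream) actually do a good job; unless something restricts you from doing so, make sure to turn off sync_with_stdio (in ios_base::). You can use getline() to read directly into std::strings, but from a performance perspective a fixed buffer used as a char* (a vector of chars or an old-school char[]) may be faster, at the cost of higher risk and complexity.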
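As a rough illustration of the two approaches mentioned above, here is a minimal sketch. The function names count_lines_getline and count_newlines_buffered and the 64 KiB buffer size are assumptions made for this example, and counting lines stands in for whatever per-line processing you actually need. Note that sync_with_stdio affects the standard streams (std::cin, std::cout), so it matters most when the input arrives on stdin.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: counts lines with std::ifstream and std::getline.
// Returns 0 if the file cannot be opened.
std::size_t count_lines_getline(const std::string& path) {
    // Decouple the C++ standard streams from C stdio, as the answer suggests.
    std::ios_base::sync_with_stdio(false);

    std::ifstream in(path);
    if (!in) return 0;

    std::size_t lines = 0;
    std::string line;  // reused on every iteration to limit reallocations
    while (std::getline(in, line)) {
        ++lines;       // replace with real per-line processing
    }
    return lines;
}

// Hypothetical helper: reads the file in fixed-size chunks and counts '\n'
// bytes. The 64 KiB chunk size is an arbitrary choice, not a tuned value.
std::size_t count_newlines_buffered(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return 0;

    std::vector<char> buf(64 * 1024);
    std::size_t newlines = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        std::streamsize got = in.gcount();  // bytes actually read this round
        newlines += static_cast<std::size_t>(
            std::count(buf.begin(), buf.begin() + got, '\n'));
    }
    return newlines;
}
```

The chunked version avoids building a std::string per line, but any line that straddles two chunks has to be stitched together by hand if you need the line contents rather than a count, which is where the extra risk and complexity come from.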

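If you're willing to play games with page-size calculations and the like, you can go the mmap route. I'd probably build it with the stream classes first and see whether that is good enough.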
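For reference, a minimal sketch of the mmap route on a POSIX system might look like the following. The function name count_newlines_mmap is hypothetical, and counting newlines is only a placeholder for real parsing. Mapping the whole file from offset 0 sidesteps the page-size arithmetic; that arithmetic becomes necessary if you map the file in windows, because the offset passed to mmap must be a multiple of the page size.

```cpp
#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close

#include <algorithm>
#include <cstddef>

// Hypothetical helper: maps a file read-only and counts '\n' bytes.
// Returns -1 on any error.
long count_newlines_mmap(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {  // zero-length mappings are rejected by mmap
        close(fd);
        return 0;
    }

    std::size_t len = static_cast<std::size_t>(st.st_size);
    void* p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (p == MAP_FAILED) return -1;

    const char* data = static_cast<const char*>(p);
    long n = static_cast<long>(std::count(data, data + len, '\n'));

    munmap(p, len);
    return n;
}
```

The mapped region is not NUL-terminated, so string functions such as strstr cannot be used on it directly; work with explicit lengths instead.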

Depending on what you do with each line of data, you may find that your processing routines are the place to optimize, not the I/O.

Answer 1 (score: 0)

Use old-style file I/O.

fopen the file for binary read
fseek to the end of the file
ftell to find out how many bytes are in the file.
malloc a chunk of memory to hold all of the bytes + 1
set the extra byte at the end of the buffer to NUL.
fseek back to the start of the file, then fread the entire file into memory.
create a vector of const char *
push_back the address of the first byte into the vector.
repeatedly 
    strstr - search the memory block for the carriage control character(s).
    put a NUL at the found position
    move past the carriage control characters
    push_back that address into the vector
until all of the text in the buffer has been processed.

----------------
use the vector to find the strings,
and process as needed.
when done, free the memory block
and the vector should self-destruct.
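The steps above can be sketched roughly as follows. The names LoadedFile, load_lines and free_lines are made up for this example. The sketch also makes two substitutions: it searches with memchr for '\n' instead of strstr (and drops a preceding '\r' if present), and it does not record a trailing empty entry when the file ends with a newline. Using fseek/ftell to size a file works on UNIX but is not guaranteed by the C standard for every platform.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical holder for the raw block and the per-line pointers into it.
struct LoadedFile {
    char* buffer = nullptr;          // malloc'd block, NUL-terminated
    std::vector<const char*> lines;  // each entry points into buffer
};

// Reads the whole file, then splits it in place by overwriting line
// terminators with NUL. Returns false on failure.
bool load_lines(const char* path, LoadedFile& out) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;

    long size = -1;
    if (std::fseek(f, 0, SEEK_END) == 0) size = std::ftell(f);
    if (size < 0 || std::fseek(f, 0, SEEK_SET) != 0) {
        std::fclose(f);
        return false;
    }

    char* buf = static_cast<char*>(std::malloc(static_cast<std::size_t>(size) + 1));
    if (!buf) {
        std::fclose(f);
        return false;
    }
    std::size_t got = std::fread(buf, 1, static_cast<std::size_t>(size), f);
    std::fclose(f);
    buf[got] = '\0';  // the extra byte at the end of the buffer

    out.buffer = buf;
    out.lines.clear();
    char* p = buf;
    char* end = buf + got;
    while (p < end) {
        out.lines.push_back(p);
        char* nl = static_cast<char*>(
            std::memchr(p, '\n', static_cast<std::size_t>(end - p)));
        if (!nl) break;                              // last line, no terminator
        if (nl > p && nl[-1] == '\r') nl[-1] = '\0';  // tolerate CRLF endings
        *nl = '\0';
        p = nl + 1;
    }
    return true;
}

// Releases the block; the vector cleans itself up when it goes out of scope.
void free_lines(LoadedFile& lf) {
    std::free(lf.buffer);
    lf.buffer = nullptr;
    lf.lines.clear();
}
```

After load_lines succeeds, each element of lines is an ordinary C string, so the entries can be passed to strtol, strcmp and similar functions without copying.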

Answer 2 (score: 0)

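If you are working with text files that store integers, floats and short strings, my experience is that FILE, fopen and fscanf are fast enough, and you get the numbers directly as well. I think memory mapping is the fastest, but it requires you to write the code that parses the file yourself, which is extra work.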
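As an illustration of the fscanf approach, here is a small sketch. The record layout (an integer, a floating-point value and a short word per line) and the names Record and read_records are assumptions made for this example, not something from the question.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical record layout: "<int> <double> <short word>" on each line.
struct Record {
    int id;
    double value;
    char name[32];
};

// Appends every well-formed record in the file to out.
// Returns false only if the file cannot be opened.
bool read_records(const char* path, std::vector<Record>& out) {
    std::FILE* f = std::fopen(path, "r");
    if (!f) return false;

    Record r;
    // %31s limits the word to 31 characters so it fits name[32] with its NUL.
    // Reading stops at end of file or at the first malformed record.
    while (std::fscanf(f, "%d %lf %31s", &r.id, &r.value, r.name) == 3) {
        out.push_back(r);
    }

    std::fclose(f);
    return true;
}
```

fscanf converts the numbers as it reads, which is the convenience the answer refers to; the trade-off is that a malformed line ends the loop without saying where it occurred.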