Question

I'm trying to import a 30 MB CSV file into Core Data using CHCSVParser from https://github.com/davedelong/CHCSVParser

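It works, and setup is very simple, but it uses a large amount of memory while parsing the whole file. The excess memory use seems to come from the end of -nextCharacter, specifically the call to -substringWithRange: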

//return nil to indicate EOF or error
if ([currentChunk length] == 0) { return nil; }

NSRange charRange = [currentChunk rangeOfComposedCharacterSequenceAtIndex:chunkIndex];
NSString * nextChar = [currentChunk substringWithRange:charRange];
chunkIndex = charRange.location + charRange.length;
return nextChar;

I was able to add an autorelease pool to that function and call -drain on it every 1,000,000 characters, but throughput then dropped.

Does anyone have any other ideas? Dave DeLong, perhaps? :-)

Answer 1

OK, so I checked, and you're right: there is significant memory buildup.

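I first tried creating a pool each time a new CSV line begins and draining it when the line is finished, but that turned out not to work with some of the other memory-management situations.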

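What I ended up doing was putting a pool in the -runParseLoop method. The pool is alloc'd before the while loop and drained after it. An unsigned short counter is incremented on every pass through the loop, and inside the loop I -drain and re-allocate the pool whenever the counter reaches 0.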

In essence:

NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
unsigned short counter = 0;
while (error == nil && 
       (currentCharacter = [self nextCharacter]) && 
       currentCharacter != nil) {
    //process the current character
    counter++;
    if (counter == 0) { //this happens every 65,536 (2**16) iterations when the unsigned short overflows
        //retain the characters that need to out-live this pool
        [pool drain];
        pool = [[NSAutoreleasePool alloc] init];
        //autorelease the characters
    }
}

[pool drain];

That's an interesting use of overflow, eh? :)
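To make the wraparound easier to see in isolation, here is a minimal sketch in plain C (which also compiles as Objective-C). The function count_wraps is a hypothetical helper for illustration only and is not part of CHCSVParser. It uses uint16_t rather than unsigned short so the 16-bit width is explicit, since unsigned short is only guaranteed to be at least 16 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper, not part of CHCSVParser: counts how many times a
   16-bit counter wraps back to 0 over `iterations` loop passes. Each wrap
   marks the point where the parse loop above would drain its pool and
   allocate a new one. */
unsigned long count_wraps(unsigned long iterations) {
    uint16_t counter = 0;   /* exactly 16 bits wide */
    unsigned long wraps = 0;
    for (unsigned long i = 0; i < iterations; i++) {
        counter++;          /* unsigned wraparound is well-defined: 65535 + 1 becomes 0 */
        if (counter == 0) {
            wraps++;        /* stand-in for [pool drain] followed by a fresh pool */
        }
    }
    return wraps;
}
```

With this sketch, count_wraps(65535) returns 0 and count_wraps(65536) returns 1, so the pool would be recycled once per 65,536 characters. An explicit modulo test on a wider counter would behave the same way; the overflow version just avoids the extra arithmetic.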

I tested this on a 190 MB CSV file, and memory usage stayed at a reasonable level (a few megabytes of active memory).

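These changes have been pushed to the master branch on the GitHub page. Try them out and let me know how they work for you. If you're still running into memory or performance problems, come back and we can try something else.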

How can I limit CHCSVParser's memory usage?
