Asynchronously reading a large file results in empty read chunks on iOS

Time: 2014-01-26 20:45:36

Tags: ios asynchronous nsinputstream

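This is driving me up the wall. I'm trying to read a 6 MB CSV file on iOS line by line. I've tried plain C file pointers and NSInputStream polling, but eventually settled on the approach below, which feels cleanest. All three approaches result in seemingly random read chunks that return success but fill the buffer with all null bytes. I say "random", but it is consistent: when I rerun the program, the reads stop working at exactly the same places, and the number of bad reads is suspicious (more on that below).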

- (id)initWithFileAtPath:(NSString *)path {
   if ((self = [super init])) {
      filePath = [path copy];
      queue = [[NSOperationQueue alloc] init];
      queue.maxConcurrentOperationCount = 1;
      buffer = [[NSMutableString alloc] init];
      bytes = malloc(CHUNK_SIZE * sizeof(UTF8Char));
   }

   return self;
}

- (void)dealloc {
   [filePath release];
   [queue release];
   [buffer release];
   free(bytes);
   [super dealloc];
}

- (void)stream:(NSInputStream *)stream handleEvent:(NSStreamEvent)eventCode {
   switch (eventCode) {
      case NSStreamEventOpenCompleted:
         break;
      case NSStreamEventHasBytesAvailable:
         [queue addOperationWithBlock:^{
            [self readChunk: stream];
            [self drainBuffer];
         }];
         break;
      case NSStreamEventEndEncountered:
         if ([buffer length] > 0) {
            [delegate reader:self didReadLine:[NSString stringWithString:buffer]];
            [buffer setString:@""];
         }

         [stream close];
         [stream removeFromRunLoop:[NSRunLoop currentRunLoop]
                           forMode:NSDefaultRunLoopMode];

         [stream release];

         [delegate readerDidFinishReading:self];

         break;
      default:
         NSLog(@"StreamReader: event %lu", (unsigned long)eventCode);
         break;
   }
}

- (void)enumerateLines {
   NSInputStream *stream = [[NSInputStream alloc] initWithFileAtPath:filePath];
   stream.delegate = self;

   [stream scheduleInRunLoop:[NSRunLoop currentRunLoop]
                     forMode:NSDefaultRunLoopMode];

   [stream open];
}

- (void)readChunk: (NSInputStream*)stream {
   NSInteger readSize = [stream read:bytes maxLength:CHUNK_SIZE];
   if (readSize) {
      if (bytes[0] == '\0') {
         NSLog(@"null buffer %ld", (long)readSize);
      }
      NSString *string = [[NSString alloc] initWithBytes:bytes
                                                  length:readSize
                                                encoding:NSUTF8StringEncoding];
      [buffer appendString:string];
      [string release];
   } else {
      NSLog(@"StreamReader: read zero bytes");
   }
}

- (void)drainBuffer {
   static NSCharacterSet *newlines = nil;
   if (newlines == nil) {
      newlines = [NSCharacterSet newlineCharacterSet];
   }

   NSRange newlinePos;
   while ((newlinePos = [buffer rangeOfCharacterFromSet:newlines]).location != NSNotFound) {
      NSString *line = [buffer substringToIndex:newlinePos.location];

      // remove the line from the buffer along with line separator
      [buffer deleteCharactersInRange: (NSRange){0, [line length]}];
      while ([buffer length] > 0 && [newlines characterIsMember:[buffer characterAtIndex:0]]) {
         [buffer deleteCharactersInRange:(NSRange){0, 1}];
      }

      [delegate reader:self didReadLine: line];
   }
}

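While reading the 6 MB file, I twice get a series of 96 "bad reads" when CHUNK_SIZE is 1024. If CHUNK_SIZE is 512, each series is 192 "bad reads" instead. (Either way, a series amounts to 98,304 bytes, i.e. 96 KiB.) What do I mean by "bad reads"? The NSInputStream read message returns success, and no error event occurs in the delegate callback. However, the bytes buffer contains nothing but null values.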
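The logging in readChunk: only tests bytes[0], so a small check like the following could confirm that a chunk really is zero from start to finish. This is a minimal sketch in plain C (which also compiles inside an Objective-C file); chunk_is_all_zero is a hypothetical helper name and is not part of the original code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true only if every one of the first `len` bytes is zero.
   An empty chunk (len == 0) is reported as "not all zero" so that a
   zero-length read is not mistaken for a null-filled one. */
static bool chunk_is_all_zero(const uint8_t *buf, size_t len) {
    if (len == 0) {
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}
```

Called as chunk_is_all_zero(bytes, (size_t)readSize) right after a positive read, it would distinguish a fully zeroed chunk from one that merely begins with a null byte.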

  • iOS 7.0.4, iPad 2
  • Does not happen on the desktop
  • Does not happen in the simulator
  • Reducing the file size to approx. 1 MB "fixes" the problem on the iPad

It is probably worth noting that I instantiate the reader class on the main UI thread.

So... am I doing something subtly (or not so subtly) wrong here? Or have I found some obscure iOS bug?

1 Answer:

Answer 0 (score: 0)

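At least one problem is that you are reading arbitrary chunks of a UTF-8 stream and then assuming that what you get is coherent. If a chunk "breaks" in the middle of a UTF-8 encoded character, it will cause a whole series of problems. If you want to build strings from partial data, your algorithm will need rework to guard against this, and that is not a trivial task.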
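One possible way to guard against split characters is to decode only the part of each chunk that ends on a complete UTF-8 sequence and carry the leftover bytes over to the next read. The sketch below is plain C (usable from Objective-C) and rests on assumptions: the helper names utf8_sequence_length and utf8_complete_prefix_length are hypothetical, and the code only finds the boundary; it does not validate the UTF-8 itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Total length of a UTF-8 sequence as announced by its first byte.
   Returns 0 for a continuation byte or an invalid lead byte. */
static size_t utf8_sequence_length(uint8_t lead) {
    if (lead < 0x80) return 1;            /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}

/* Returns how many leading bytes of buf can be decoded without cutting
   the final character in half. The caller keeps the remaining
   (len - result) bytes and prepends them to the next chunk. */
static size_t utf8_complete_prefix_length(const uint8_t *buf, size_t len) {
    /* Inspect at most the last 4 bytes, walking backwards until a
       non-continuation byte (the start of the last sequence) is found. */
    for (size_t back = 0; back < len && back < 4; back++) {
        uint8_t b = buf[len - 1 - back];
        if ((b & 0xC0) != 0x80) {
            size_t need = utf8_sequence_length(b);
            size_t have = back + 1;
            if (need > have) {
                return len - have;  /* last sequence is truncated */
            }
            return len;  /* last sequence is complete (or invalid) */
        }
    }
    /* Empty input, or only continuation bytes in the tail (malformed):
       leave the length unchanged and let the decoder report it. */
    return len;
}
```

In readChunk:, this would mean passing only the first utf8_complete_prefix_length(bytes, readSize) bytes to initWithBytes:length:encoding: and moving the rest to the front of the buffer before the next read. It addresses the decoding concern raised in this answer; the source does not establish whether that concern is related to the null-filled chunks.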