Question

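I'm trying to read a huge txt file in C++. It is 70 MB. My goal is to take a substring of each line and generate a smaller txt file that contains only the information I need.

I came up with the code below to read the file. It works fine with smaller files, but not with the 70 MB monster.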

#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main()
{
  ifstream myReadFile;
  myReadFile.open("C:/Users/Lucas/Documents/apps/COTAHIST_A2010.txt");
  char output[100];
  if (myReadFile.is_open()) {
    while (myReadFile.eof()!=1) {
         myReadFile >> output;
         cout<<output;
         cout<<"\n";
     }


    }
  system("PAUSE");
  return 0;
}

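This is the error I get: Unhandled exception at 0x50c819bc (msvcp100d.dll) in SeparadorDeAcoes.exe: 0xC0000005: Access violation reading location 0x3a70fcbc.

If anyone can point to a solution in C or even C#, that would be acceptable too!

Thanks =)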

Answer 1

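Your char output[100] buffer is not able to hold the contents of one of the lines.

Ideally you should use a string destination, not a char[] buffer.

Edit: as has been pointed out, the loop at the bottom of this answer is bad practice: it leads to reading the last line twice or to a stray empty last line. A more correct way to write the loop would be: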

string output;
while (getline(myReadFile, output)) {
  cout<<output<<"\n";
}

Edit: leaving the bad, evil code here:

A quick rewrite of your inner while loop might be:

string output;
while (myReadFile.good()) {
  getline(myReadFile, output);
  cout<<output<<"\n";
}

Answer 2

I think your problem is that one of your lines is more than 100 characters long. You need to increase the size of the character array.
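
A larger array only moves the limit. If a fixed char array is kept anyway, the extraction can be bounded with std::setw so that operator>> never writes past the end of the buffer. This is a sketch under that assumption; the function name is hypothetical, and a token longer than the buffer is split into several pieces rather than read whole.

```cpp
#include <cstddef>
#include <cstring>
#include <iomanip>
#include <istream>

// Reads whitespace-separated tokens into a fixed 100-byte buffer without
// ever writing past its end. Returns the length of the longest piece stored.
std::size_t longest_bounded_piece(std::istream& in)
{
    char output[100];
    std::size_t longest = 0;
    // setw(n) makes operator>> store at most n-1 characters plus '\0'.
    // The width resets after each extraction, so it is set every time.
    while (in >> std::setw(static_cast<int>(sizeof output)) >> output) {
        const std::size_t len = std::strlen(output);
        if (len > longest) {
            longest = len;
        }
    }
    return longest;
}
```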

Answer 3

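You are not using std::string, but you are including its header file. Decide: use either std::string or character arrays.

Also, use std::istream::read and supply the size of the array to the function. You will need to repeat the call many times, since 100 characters is far smaller than 70 MB.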

Try allocating a bigger array using dynamic memory:

#include <cstdlib>  // EXIT_SUCCESS

const unsigned int array_size = 1024 * 1024 * 1024;  // 1 GiB

int main(void)
{
  char * output;
  //...
  output = new char [array_size];
  // read into output
  // ...
  // clean up
  delete [] output;
  return EXIT_SUCCESS;
}

If you use std::string, use the constructor that takes a size parameter so that you can specify the initial size of the string.
