Skipping the header when reading a text file in C

Date: 2017-08-17 11:06:33

Tags: c file-io

I am reading data from the file orderedfile.txt. Sometimes this file has a header of the following form:

BEGIN header

       Real Lattice(A)               Lattice parameters(A)    Cell Angles
   2.4675850   0.0000000   0.0000000     a =    2.467585  alpha =   90.000000
   0.0000000  30.0000000   0.0000000     b =   30.000000  beta  =   90.000000
   0.0000000   0.0000000  30.0000000     c =   30.000000  gamma =   90.000000

 1                            ! nspins
25   300   300                ! fine FFT grid along <a,b,c>
END header: data is "<a b c> pot" in units of Hartrees

 1     1     1            0.042580
 1     1     2            0.049331
 1     1     3            0.038605
 1     1     4            0.049181

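Sometimes there is no header and the data starts on the first line. My code for reading the data is shown below. It works when the data starts on the first line, but not when the header is present. Is there a way to get around this?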

int readinputfile() {
   FILE *potential = fopen("orderedfile.txt", "r");
   for (i=0; i<size; i++) {
      fscanf(potential, "%lf %lf %*f  %lf", &x[i], &y[i], &V[i]);
   }
   fclose(potential);
}

3 answers:

Answer 0 (score: 2)

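Check the return value of fscanf. If it returns three, the input was a valid data row; otherwise you are still in the header, so you have to skip the rest of the line: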

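int readinputfile() {
    /* x, y, V and size are assumed to be declared as in the question */
    FILE *potential = fopen("orderedfile.txt", "r");
    int res;
    int i = 0;
    if (potential == NULL) {
        return -1;
    }
    /* Stop at end of file (EOF) or once the arrays are full */
    while (i < size &&
           (res = fscanf(potential, "%lf %lf %*f %lf", &x[i], &y[i], &V[i])) != EOF) {
        if (res != 3) {
            /* Not a data row: discard the rest of the line and try again */
            fscanf(potential, "%*[^\n]");
            continue;
        }
        i++;
        ... // Optionally, do anything else with the data that you read
    }
    fclose(potential);
    return i; /* number of data rows read */
}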
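To see why the return value separates header lines from data rows, here is a minimal sketch that applies the same format string to one line at a time with sscanf(). The helper name count_fields is hypothetical and is not part of the answer above.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many values sscanf() stored for one
 * line, using the same format string as the answer above. */
int count_fields(const char *line)
{
    double a, b, v;
    return sscanf(line, "%lf %lf %*f %lf", &a, &b, &v);
}
```

For a data row such as " 1     1     1            0.042580" this returns 3. For "BEGIN header" it returns 0, and for the lattice lines it returns 2, because parsing stops at the letter "a". An empty line returns EOF, so a line-based loop should compare the result against 3 rather than test it for non-zero.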


Answer 1 (score: 2)

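The following code uses fgets() to read each line. For each line, sscanf() scans the string and stores the values in double variables. A running example (with stdin) is available at ideone.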

#include <stdio.h>

int main()
{
   /* maybe the buffer must be greater */
   char lineBuffer[256];
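   FILE *potential = fopen("orderedfile.txt", "r");

   /* fgets() on a NULL stream is undefined, so stop if the file did not open */
   if (potential == NULL)
   {
      perror("orderedfile.txt");
      return 1;
   }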

   /* loop through every line */
   while (fgets(lineBuffer, sizeof(lineBuffer), potential) != NULL)
   {
      double a, b, c;
      /* if there are 3 items matched print them */
      if (3 == sscanf(lineBuffer, "%lf %lf %*f %lf", &a, &b, &c))
      {
         printf("%f %f %f\n", a, b, c);
      }
   }
   fclose(potential);

   return 0;
}

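This works with the header you provided, but if the header contained a line such as:

 1     1     2            0.049331

then that line would be read as data too. Another possibility is to search for the words END header once BEGIN header has been seen, as in the header you showed, or to use a line count if the number of header lines is known.
To search for a substring, you can use the function strstr().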

Answer 2 (score: 2)

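I think explicitly looking for the start and end of the header is more reliable than relying on the header containing no line that matches the scanf()-style format string: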

FILE *fp = fopen(...);

int inHeader = 0;

size_t lineLen = 128;
char *linePtr = malloc( lineLen );

// skip header lines
while ( getline( &linePtr, &lineLen, fp ) >= ( ssize_t ) 0 )
{
    // check for the start of the header (need to do this first to
    // catch the first line)
    if ( !inHeader )
    {
        inHeader = !strncmp( linePtr, "BEGIN header", strlen( "BEGIN header" ) );
    }
    else
    {
        // if we were in the header, check for the end line and go to next line
        inHeader = strncmp( linePtr, "END header", strlen( "END header" ) );

        // need to skip this line no matter what because it's in the header
        continue;
    }

    // if we're not in the header, either break this loop
    // which leaves the file at the first non-header line,
    // or process the line in this loop
    if ( !inHeader )
    {
        ...
    }
}
...

You may prefer to use strstr() instead of strncmp(). That way, the header start/end strings do not have to be at the beginning of the line.
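The difference between the two checks can be illustrated with a small sketch. The helper names starts_with and contains are hypothetical.

```c
#include <string.h>

/* Prefix test: true only when the line begins with the marker. */
int starts_with(const char *line, const char *marker)
{
    return strncmp(line, marker, strlen(marker)) == 0;
}

/* Substring test: true when the marker occurs anywhere in the line. */
int contains(const char *line, const char *marker)
{
    return strstr(line, marker) != NULL;
}
```

For an indented line such as "  BEGIN header", starts_with() returns 0 while contains() returns 1. The trade-off is that the substring test also matches any other line that happens to mention the marker text.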