Best way to read and store this input?

Time: 2017-12-03 06:34:11

Tags: c

I have an input file named animals.dat in which the data is formatted like this:

1,Allegra,Pseudois nayaur,S,5 
2,unknown,Ailurus fulgens,X,10
3,Athena,Moschus fuscus,X,2

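The code I use to store and process the data looks like this. However, for some reason it seems to get stuck in an infinite loop. Any suggestions on how to make it correct/better?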

void choice3(FILE *infile) {
    int id;
    printf("Enter ID ");
    scanf("%d", &id);
    while(!feof(infile)) { 
        int animalID;
        char animalName[20];
        char animalType[20];
        char animalSize;
        int animalAge;
        fscanf(infile,"%d,",&animalID);
        fscanf(infile,"%[^,] ",animalName);
        fscanf(infile,"%s, %c, %d",animalType,&animalSize,&animalAge);
        if(animalID == id) {
            printf("Animal Found");  
        }
    }
    rewind(infile);

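Edit:

Here is a link to the exact binary file I have to use as input. https://drive.google.com/open?id=18olXBhRgpGyY0bhpjDSwla2XcBnWoFGM

Also, my instructions for this part say: "All animals are listed in increasing order by their id number, starting with the value 1. If there is a hole in the id numbers, e.g. 2, then the structure information is still present in the file, except that the name component contains the string "unknown" to indicate an empty record. Make sure your search uses random file processing. If an invalid id is entered, in this case any value other than 1 or 3, the program will display an error message. Otherwise, display the animal record. In either case, the program will return to the initial menu."

I have updated the code to follow those instructions.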

void choice3(FILE *infile) {
    Animal tempAnimal;
    int id;
    printf("Enter ID ");
    scanf(" %d", &id);
    fseek(infile,id * sizeof(struct animal),SEEK_SET);
    fread(&tempAnimal,sizeof(struct animal),1,infile);
    printf("%d -- %s\n",tempAnimal->id,tempAnimal->name);
}

I have defined the structure in another file, animal.h.

struct animal {
    short int id;
    char name[20];
    char species[35];
    char size;
    short int age;
};
typedef struct animal* Animal;

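However, for some reason I now get "Segmentation fault: 11". This means it is failing at my fread() line. Any suggestions?

2 answers:

Answer 0: (score: 3)

This code works for me; I've turned it into something close to an MCVE (Minimal, Complete, Verifiable Example):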

#include <stdio.h>

static void choice3(FILE *infile, int id)
{
    int animalID = -37;
    char animalName[20];
    char animalType[20];
    char animalSize;
    int animalAge;
    while (fscanf(infile, "%d , %19[^,] , %19[^,] , %c , %d",
                  &animalID, animalName, animalType, &animalSize, &animalAge) == 5)
    {
        printf("Read: %d: %s, %s, %c, %d\n",
               animalID, animalName, animalType, animalSize, animalAge);
        if (animalID == id)
        {
            printf("Animal Found: %d: %s, %s, %c, %d\n",
                   animalID, animalName, animalType, animalSize, animalAge);
        }
    }
    if (feof(infile))
        printf("EOF\n");
    else
        printf("Format error\n");
}

int main(void)
{
    choice3(stdin, 3);
    return 0;
}

It hard-wires the required animal ID to 3 and reads from standard input, so I ran the program (csv47) on your data file (data) and got:

$ ./csv47 < data
Read: 1: Allegra, Pseudois nayaur, S, 5
Read: 2: unknown, Ailurus fulgens, X, 10
Read: 3: Athena, Moschus fuscus, X, 2
Animal Found: 3: Athena, Moschus fuscus, X, 2
EOF
$

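Not all of the spaces in the fscanf() format string are necessary; none of them is harmful. Note that the code checks for the correct number of converted fields and exits the loop when it does not get them. Note also that it prints the data so that it is clear what was read; this is a basic debugging technique. The test after the loop is a correct use of feof(); using feof() to control a loop is almost always wrong.

You would probably do better to use a line-reading function (such as fgets() or POSIX getline()) to read a line of data, which you can then print, scan, rescan, and use to report the line that caused trouble. This generally leads to better error reporting, if only because you have the whole line available rather than whatever fragment is left after fscanf() has read some but not all of the fields.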
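The line-based approach described above might look like the following minimal sketch. The names csv_animal, parse_animal_line and find_animal are made up for illustration, and the 256-byte line buffer is an assumption about the maximum line length, not something the question specifies.

```c
#include <stdio.h>

/* Hypothetical record type for one CSV line; field sizes follow the question. */
struct csv_animal {
    int  id;
    char name[20];
    char type[20];
    char size;
    int  age;
};

/* Parse one line of the form "1,Allegra,Pseudois nayaur,S,5".
 * Returns 1 when all five fields convert, 0 otherwise. */
static int parse_animal_line(const char *line, struct csv_animal *out)
{
    return sscanf(line, "%d , %19[^,] , %19[^,] , %c , %d",
                  &out->id, out->name, out->type, &out->size, &out->age) == 5;
}

/* Read the stream line by line, look for the wanted id, and report any line
 * that does not parse, quoting the whole line. Returns 1 if the id was found. */
static int find_animal(FILE *fp, int wanted, struct csv_animal *found)
{
    char line[256];   /* assumed to be long enough for any input line */
    int lineno = 0;
    int hit = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        struct csv_animal tmp;
        lineno++;
        if (!parse_animal_line(line, &tmp)) {
            fprintf(stderr, "line %d: cannot parse: %s", lineno, line);
            continue;
        }
        if (tmp.id == wanted) {
            *found = tmp;
            hit = 1;
        }
    }
    return hit;
}
```

A caller would open the file, call find_animal(fp, id, &result), and print either the record or an error message. Because the loop is controlled by the return value of fgets(), it ends at end of file without any call to feof().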

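Also note that this does not handle fields enclosed in double quotes that contain commas, or several other standard CSV conventions. Those really need library code to handle the reading.

Finally, this answer only concentrates on reading the data into local variables and avoiding the "infinite loop". For a discussion of storage, see David C. Rankin's answer.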

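animals.dat from Google Drive

The animals.dat file available from Google Drive at 2017-12-03 19:00:00 -08:00 is a binary file written with little-endian integers (Intel machines) using the 60-byte structure outlined in the question (and used in the printing code below). In case it becomes unavailable, here is the output of xxd -i animals.dat, a C array definition containing the same data: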

unsigned char animals_dat[] = {
  0x01, 0x00, 0x41, 0x62, 0x69, 0x67, 0x61, 0x69, 0x6c, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x43, 0x61,
  0x70, 0x72, 0x69, 0x63, 0x6f, 0x72, 0x6e, 0x69, 0x73, 0x20, 0x73, 0x75,
  0x6d, 0x61, 0x74, 0x72, 0x61, 0x65, 0x6e, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x53, 0x08, 0x00,
  0x02, 0x00, 0x75, 0x6e, 0x6b, 0x6e, 0x6f, 0x77, 0x6e, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x4f, 0x72,
  0x79, 0x78, 0x20, 0x6c, 0x65, 0x75, 0x63, 0x6f, 0x72, 0x79, 0x78, 0x00,
  0x6d, 0x61, 0x74, 0x72, 0x61, 0x65, 0x6e, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4d, 0x0c, 0x00,
  0x03, 0x00, 0x41, 0x64, 0x72, 0x69, 0x61, 0x6e, 0x00, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x43, 0x65,
  0x70, 0x68, 0x61, 0x6c, 0x6f, 0x70, 0x68, 0x75, 0x73, 0x20, 0x64, 0x6f,
  0x72, 0x73, 0x61, 0x6c, 0x69, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4c, 0x10, 0x00,
  0x04, 0x00, 0x41, 0x68, 0x6d, 0x65, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x4e, 0x61,
  0x65, 0x6d, 0x6f, 0x72, 0x68, 0x65, 0x64, 0x75, 0x73, 0x20, 0x67, 0x72,
  0x69, 0x73, 0x65, 0x75, 0x73, 0x00, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4c, 0x0a, 0x00,
  0x05, 0x00, 0x41, 0x69, 0x64, 0x61, 0x6e, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x4e, 0x61,
  0x65, 0x6d, 0x6f, 0x72, 0x68, 0x65, 0x64, 0x75, 0x73, 0x20, 0x63, 0x61,
  0x75, 0x64, 0x61, 0x74, 0x75, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x58, 0x09, 0x00,
  0x06, 0x00, 0x41, 0x6c, 0x6c, 0x65, 0x67, 0x72, 0x61, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x50, 0x73,
  0x65, 0x75, 0x64, 0x6f, 0x69, 0x73, 0x20, 0x6e, 0x61, 0x79, 0x61, 0x75,
  0x72, 0x00, 0x61, 0x74, 0x75, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x53, 0x05, 0x00,
  0x07, 0x00, 0x41, 0x6d, 0x65, 0x6c, 0x61, 0x00, 0x61, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x43, 0x65,
  0x72, 0x64, 0x6f, 0x63, 0x79, 0x6f, 0x6e, 0x20, 0x74, 0x68, 0x6f, 0x75,
  0x73, 0x00, 0x61, 0x74, 0x75, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4d, 0x0b, 0x00,
  0x08, 0x00, 0x75, 0x6e, 0x6b, 0x6e, 0x6f, 0x77, 0x6e, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x43, 0x61,
  0x70, 0x72, 0x61, 0x20, 0x66, 0x61, 0x6c, 0x63, 0x6f, 0x6e, 0x65, 0x72,
  0x69, 0x00, 0x61, 0x74, 0x75, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4d, 0x01, 0x00,
  0x09, 0x00, 0x41, 0x6e, 0x6a, 0x6f, 0x6c, 0x69, 0x65, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x41, 0x69,
  0x6c, 0x75, 0x72, 0x75, 0x73, 0x20, 0x66, 0x75, 0x6c, 0x67, 0x65, 0x6e,
  0x73, 0x00, 0x61, 0x74, 0x75, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4c, 0x0a, 0x00,
  0x0a, 0x00, 0x41, 0x74, 0x68, 0x65, 0x6e, 0x61, 0x00, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x4d, 0x6f,
  0x73, 0x63, 0x68, 0x75, 0x73, 0x20, 0x66, 0x75, 0x73, 0x63, 0x75, 0x73,
  0x00, 0x00, 0x61, 0x74, 0x75, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x53, 0x05, 0x00,
  0x0b, 0x00, 0x41, 0x76, 0x61, 0x00, 0x6e, 0x61, 0x00, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x43, 0x65,
  0x70, 0x68, 0x61, 0x6c, 0x6f, 0x70, 0x68, 0x75, 0x73, 0x20, 0x6a, 0x65,
  0x6e, 0x74, 0x69, 0x6e, 0x6b, 0x69, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4d, 0x0d, 0x00,
  0x0c, 0x00, 0x41, 0x78, 0x65, 0x6c, 0x00, 0x61, 0x00, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x48, 0x69,
  0x70, 0x70, 0x6f, 0x63, 0x61, 0x6d, 0x65, 0x6c, 0x75, 0x73, 0x20, 0x61,
  0x6e, 0x74, 0x69, 0x73, 0x65, 0x6e, 0x73, 0x69, 0x73, 0x00, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x4d, 0x0b, 0x00,
  0x0d, 0x00, 0x41, 0x79, 0x61, 0x6e, 0x6e, 0x61, 0x00, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x47, 0x61,
  0x7a, 0x65, 0x6c, 0x6c, 0x61, 0x20, 0x63, 0x75, 0x76, 0x69, 0x65, 0x72,
  0x69, 0x00, 0x69, 0x73, 0x65, 0x6e, 0x73, 0x69, 0x73, 0x00, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x53, 0x0c, 0x00,
  0x0e, 0x00, 0x42, 0x72, 0x61, 0x64, 0x6c, 0x65, 0x79, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x42, 0x75,
  0x62, 0x61, 0x6c, 0x75, 0x73, 0x20, 0x6d, 0x69, 0x6e, 0x64, 0x6f, 0x72,
  0x65, 0x6e, 0x73, 0x69, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x58, 0x04, 0x00,
  0x0f, 0x00, 0x42, 0x72, 0x65, 0x6e, 0x64, 0x61, 0x6e, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x74, 0x01, 0x00, 0x00, 0x74, 0x01, 0x42, 0x6f,
  0x73, 0x20, 0x67, 0x61, 0x75, 0x72, 0x75, 0x73, 0x00, 0x64, 0x6f, 0x72,
  0x65, 0x6e, 0x73, 0x69, 0x73, 0x00, 0x73, 0x69, 0x73, 0x00, 0x00, 0x00,
  0x5c, 0x3f, 0x1b, 0x00, 0x5c, 0x4f, 0x1b, 0x00, 0x5c, 0x58, 0x01, 0x00
};
unsigned int animals_dat_len = 900;

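Code to read the binary animals.dat

As I noted in a comment below, the data file leaks information, because after each name there is garbage data left over from previous records. This is one of those cases where the null-padding behaviour of strncpy() actually becomes useful; it zaps the extraneous data from previous records with null bytes, but this evidently was not done when animals.dat was generated.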
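As a side note on the strncpy() point, here is a minimal sketch of how a program writing such records could clear a fixed-width field before saving it. The helper names set_field and debris_bytes are hypothetical; nothing is known about how animals.dat was actually produced.

```c
#include <string.h>

#define NAME_LEN 20   /* matches char name[20] in the question's struct */

/* Copy src into a fixed-width field. strncpy() pads the rest of the field
 * with '\0' bytes when src is shorter than the field, which wipes whatever
 * was in the buffer before; the last byte is forced to '\0' in case src is
 * too long, because strncpy() does not terminate in that case. */
static void set_field(char *field, size_t width, const char *src)
{
    strncpy(field, src, width);
    field[width - 1] = '\0';
}

/* Count the non-zero bytes after the string's terminator (the "debris"). */
static size_t debris_bytes(const char *field, size_t width)
{
    size_t n = 0;
    for (size_t i = strlen(field); i < width; i++)
        if (field[i] != '\0')
            n++;
    return n;
}
```

Calling set_field(rec.name, sizeof rec.name, "Ava") on a reused record leaves only null bytes after the name, so nothing from an earlier record reaches the file.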

#include <stdio.h>
#include <string.h>
#include <ctype.h>

struct animal {
    short int id;
    char name[20];
    char species[35];
    char size;
    short int age;
};

static void debris_field(const char *tag, const char *field, size_t length)
{
    size_t nomlen = strlen(field);
    int count = 0;
    for (size_t i = nomlen; i < length; i++)
    {
        if (field[i] != '\0')
        {
            if (count == 0)
                printf("%8s (%2zu = %-20s) has debris:\n        ", tag, nomlen, field);
            count++;
            unsigned char u = field[i];
            if (isprint(u))
                putchar(u);
            else
                printf("\\x%.2X", u);
        }
    }
    if (count != 0)
        putchar('\n');
}

static void report_debris(const struct animal *info)
{
    debris_field("name", info->name, sizeof(info->name));
    debris_field("species", info->species, sizeof(info->species));
}

static void choice2(FILE *infile, int noisy)
{
    struct animal info;
    while (fread(&info, sizeof(info), 1, infile) == 1)
    {
        if (strcmp(info.name, "unknown") == 0)
        {
            printf("Deleted: %2d %20s %30s %c %2d\n", info.id, info.name, info.species, info.size, info.age);
        }
        else
        {
            printf("Current: %2d %20s %30s %c %2d\n", info.id, info.name, info.species, info.size, info.age);
        }
        if (noisy)
            report_debris(&info);
    }
}

int main(int argc, char **argv)
{
    int noisy = 0;
    if (argc > 1 && argv[argc] == 0)    // Use argv
        noisy = 1;
    choice2(stdin, noisy);
    return 0;
}

The 'Use argv' comment is relevant because I compile with GCC 7.2.0 on a MacBook Pro running macOS High Sierra 10.13.1, using this command line (source code in animals59.c):

$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes \
>     -Wstrict-prototypes animals59.c -o animals59
$

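If the code did not use argv in some way, the compiler would complain and the code would not compile (with -Wextra and -Werror, an unused parameter is an error).

Output - no arguments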

Current:  1              Abigail       Capricornis sumatraensis S  8
Deleted:  2              unknown                  Oryx leucoryx M 12
Current:  3               Adrian           Cephalophus dorsalis L 16
Current:  4                Ahmed            Naemorhedus griseus L 10
Current:  5                Aidan           Naemorhedus caudatus X  9
Current:  6              Allegra                Pseudois nayaur S  5
Current:  7                Amela                Cerdocyon thous M 11
Deleted:  8              unknown                Capra falconeri M  1
Current:  9              Anjolie                Ailurus fulgens L 10
Current: 10               Athena                 Moschus fuscus S  5
Current: 11                  Ava           Cephalophus jentinki M 13
Current: 12                 Axel        Hippocamelus antisensis M 11
Current: 13               Ayanna                Gazella cuvieri S 12
Current: 14              Bradley            Bubalus mindorensis X  4
Current: 15              Brendan                     Bos gaurus X  1

Output - with an argument

Current:  1              Abigail       Capricornis sumatraensis S  8
    name ( 7 = Abigail             ) has debris:
        \x04t\x01t\x01
 species (24 = Capricornis sumatraensis) has debris:
        \?\x1B\O\x1B\
Deleted:  2              unknown                  Oryx leucoryx M 12
    name ( 7 = unknown             ) has debris:
        \x04t\x01t\x01
 species (13 = Oryx leucoryx       ) has debris:
        matraensis\?\x1B\O\x1B\
Current:  3               Adrian           Cephalophus dorsalis L 16
    name ( 6 = Adrian              ) has debris:
        \x04t\x01t\x01
 species (20 = Cephalophus dorsalis) has debris:
        sis\?\x1B\O\x1B\
Current:  4                Ahmed            Naemorhedus griseus L 10
    name ( 5 = Ahmed               ) has debris:
        \x04t\x01t\x01
 species (19 = Naemorhedus griseus ) has debris:
        sis\?\x1B\O\x1B\
Current:  5                Aidan           Naemorhedus caudatus X  9
    name ( 5 = Aidan               ) has debris:
        \x04t\x01t\x01
 species (20 = Naemorhedus caudatus) has debris:
        sis\?\x1B\O\x1B\
Current:  6              Allegra                Pseudois nayaur S  5
    name ( 7 = Allegra             ) has debris:
        \x04t\x01t\x01
 species (15 = Pseudois nayaur     ) has debris:
        atussis\?\x1B\O\x1B\
Current:  7                Amela                Cerdocyon thous M 11
    name ( 5 = Amela               ) has debris:
        a\x04t\x01t\x01
 species (15 = Cerdocyon thous     ) has debris:
        atussis\?\x1B\O\x1B\
Deleted:  8              unknown                Capra falconeri M  1
    name ( 7 = unknown             ) has debris:
        \x04t\x01t\x01
 species (15 = Capra falconeri     ) has debris:
        atussis\?\x1B\O\x1B\
Current:  9              Anjolie                Ailurus fulgens L 10
    name ( 7 = Anjolie             ) has debris:
        \x04t\x01t\x01
 species (15 = Ailurus fulgens     ) has debris:
        atussis\?\x1B\O\x1B\
Current: 10               Athena                 Moschus fuscus S  5
    name ( 6 = Athena              ) has debris:
        \x04t\x01t\x01
 species (14 = Moschus fuscus      ) has debris:
        atussis\?\x1B\O\x1B\
Current: 11                  Ava           Cephalophus jentinki M 13
    name ( 3 = Ava                 ) has debris:
        na\x04t\x01t\x01
 species (20 = Cephalophus jentinki) has debris:
        sis\?\x1B\O\x1B\
Current: 12                 Axel        Hippocamelus antisensis M 11
    name ( 4 = Axel                ) has debris:
        a\x04t\x01t\x01
 species (23 = Hippocamelus antisensis) has debris:
        \?\x1B\O\x1B\
Current: 13               Ayanna                Gazella cuvieri S 12
    name ( 6 = Ayanna              ) has debris:
        \x04t\x01t\x01
 species (15 = Gazella cuvieri     ) has debris:
        isensis\?\x1B\O\x1B\
Current: 14              Bradley            Bubalus mindorensis X  4
    name ( 7 = Bradley             ) has debris:
        \x04t\x01t\x01
 species (19 = Bubalus mindorensis ) has debris:
        sis\?\x1B\O\x1B\
Current: 15              Brendan                     Bos gaurus X  1
    name ( 7 = Brendan             ) has debris:
        \x04t\x01t\x01
 species (10 = Bos gaurus          ) has debris:
        dorensissis\?\x1B\O\x1B\
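The code above reads the file sequentially. For the random-access lookup that the question asks about, a minimal sketch follows. It assumes that records are stored in id order starting at 1 (as the assignment text states) and that sizeof(struct animal) matches the 60-byte on-disk record, which holds on the platform described above but is not guaranteed everywhere. The function name lookup_animal is made up for illustration. Note that the question's version declares tempAnimal with the pointer typedef Animal, so fread() writes 60 bytes over an uninitialized pointer; that is the most likely source of the segmentation fault. It also seeks to id * sizeof(struct animal), which is one record too far when ids start at 1.

```c
#include <stdio.h>
#include <string.h>

struct animal {
    short int id;
    char name[20];
    char species[35];
    char size;
    short int age;
};

/* Seek straight to the record for `id` and read it into *out.
 * Assumes records are stored in id order starting at 1, so record `id`
 * begins at byte offset (id - 1) * sizeof(struct animal).
 * Returns 1 on success, 0 for an out-of-range id, a failed seek or read,
 * or an empty ("unknown") record. */
static int lookup_animal(FILE *fp, int id, struct animal *out)
{
    if (id < 1)
        return 0;
    long offset = (long)(id - 1) * (long)sizeof(struct animal);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;
    if (fread(out, sizeof *out, 1, fp) != 1)
        return 0;                       /* past the end of the file */
    if (out->id != id || strcmp(out->name, "unknown") == 0)
        return 0;                       /* hole in the id sequence */
    return 1;
}
```

The caller passes a real struct animal object (not a pointer typedef), prints the record when the function returns 1, and prints an error message otherwise. The file should be opened in binary mode ("rb").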

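Answer 1: (score: 2)

One way to simplify coordinating several pieces of information that need to be treated as a group (e.g. animal id, name, type, size and age) is to capture that information in a struct. You can then use an array of struct to hold the information for all animals. This simplifies your data collection and lets you keep all values in memory for queries and so on.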

To read the differing data, note that all the data for a single animal is on one line. That should point you to a line-oriented input function such as fgets or POSIX getline. After reading a line of data, you can parse the needed values from it (using, e.g., sscanf, or by walking a pointer down the line). This has the benefit of (1) validating the read independently from (2) validating the individual values parsed from the line (as well as consuming the trailing '\n' before your next read).

Putting those together, you can simply read your animal data into an array of animals similar to the following:

#include <stdio.h>

/* consts for max name/type, animals array, characters for buf */
enum { MAXNT = 20, MAXA = 128, MAXC = 512 };

typedef struct {
    int id, age;
    char name[MAXNT], type[MAXNT], size;
} animal;

int main (int argc, char **argv) {

    int n = 0;                              /* array index */
    char buf[MAXC] = "";                    /* line buffer */
    animal animals[MAXA] = {{ .id = 0 }};   /* animals array */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    while (n < MAXA && fgets (buf, MAXC, fp)) { /* read each line */
        /* parse animal data from line and validate conversion */
        if (sscanf (buf, "%d, %19[^,], %19[^,], %c, %d",
                    &animals[n].id, animals[n].name, animals[n].type,
                    &animals[n].size, &animals[n].age) == 5)
            n++;    /* increment array index on successful conversion */
    }
    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    /* do what you need with data (printing here) */
    for (int i = 0; i < n; i++)
        printf ("%3d %-20s %-20s %c %3d\n",
                animals[i].id, animals[i].name, animals[i].type,
                animals[i].size, animals[i].age);

    return 0;
}

Example Use/Output

$ ./bin/animals <dat/animals.dat
  1 Allegra              Pseudois nayaur      S   5
  2 unknown              Ailurus fulgens      X  10
  3 Athena               Moschus fuscus       X   2

Reading the Binary 'animals.dat'

When the input file format changes dramatically, along with the sizes of name and type, that significantly changes how the problem is approached. Without repeating what Mr. Leffler has done, let's look at some additional ways to handle the types for id and age.

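While there is nothing wrong with using the traditional short int notation, note that stdint.h provides exact-width types that let you specify integer/unsigned values that are exactly 8, 16, 32 or 64 bits wide (some implementations add wider types). This eliminates any possibility of architecture or compiler variation in the size of the type. The corresponding exact-width printf / scanf format specifiers are provided in inttypes.h.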
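As a small illustration of the exact-width types and their format macros, here is a sketch; the helper names format_id_age and parse_id_age are made up for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format an id/age pair into buf using the exact-width macros.
 * PRIu16 expands to a string literal, so it is placed between the quoted
 * pieces of the format and the compiler joins the adjacent literals. */
static int format_id_age(char *buf, size_t len, uint16_t id, uint16_t age)
{
    return snprintf(buf, len, "%3" PRIu16 " %3" PRIu16, id, age);
}

/* Parse "id age" from text with the matching scanf macro SCNu16.
 * Returns 1 when both values convert. */
static int parse_id_age(const char *text, uint16_t *id, uint16_t *age)
{
    return sscanf(text, "%" SCNu16 " %" SCNu16, id, age) == 2;
}
```

The printf and scanf macros differ (PRIu16 versus SCNu16) because printf receives the value after promotion to int, while scanf needs the exact size of the object it stores into.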

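Without knowing what is in the binary animals.dat file, you are left to inspect the contents yourself to determine the record size, the individual variable sizes, and the endianness. (I saved it as animals.bin.dat to distinguish it from your first file.) Linux tools for examining the bytes of a file include od, hexdump, etc. Windows can provide a similar dump in PowerShell, or the old standby WinHex Hex Editor, available as a free download, works well [1]. There is nothing magic about it; you just dump the file and start identifying what you can and start counting. For example, the first two records of the binary animals.dat are:

$ hexdump -Cv dat/animals.bin.dat
00000000  01 00 41 62 69 67 61 69  6c 00 00 00 04 00 00 00  |..Abigail.......|
00000010  74 01 00 00 74 01 43 61  70 72 69 63 6f 72 6e 69  |t...t.Capricorni|
00000020  73 20 73 75 6d 61 74 72  61 65 6e 73 69 73 00 00  |s sumatraensis..|
00000030  5c 3f 1b 00 5c 4f 1b 00  5c 53 08 00 02 00 75 6e  |\?..\O..\S....un|
00000040  6b 6e 6f 77 6e 00 00 00  04 00 00 00 74 01 00 00  |known.......t...|
00000050  74 01 4f 72 79 78 20 6c  65 75 63 6f 72 79 78 00  |t.Oryx leucoryx.|
00000060  6d 61 74 72 61 65 6e 73  69 73 00 00 5c 3f 1b 00  |matraensis..\?..|
00000070  5c 4f 1b 00 5c 4d 0c 00  03 00 41 64 72 69 61 6e  |\O..\M....Adrian|

Using the id, name, type, size, age variables as a guide, you can determine the integer and string widths. With those, you can use fread to read a record of 60-bytes at a time and memcpy the appropriate bytes from the record to the individual variables as needed. This file is a good example (of a bad example) of what happens when you write fixed-length arrays containing strings to a file without properly initializing them first. Garbage is left between the nul-terminator of the string and the start of the next data. That is most likely where the debris comes from, which is discussed at length in the other answer. Suffice it to say the debris makes your examination a bit more challenging.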
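The offsets that fall out of that byte count (id at 0, name at 2, type at 22, size at 57, age at 58) can be captured in a short sketch. The version below assembles the two 16-bit fields from individual bytes, so it does not depend on the byte order of the machine that reads the file; this is an alternative to memcpy, not the method used in this answer. The names le16, decoded and decode_record are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Byte offsets inside one 60-byte record, as counted from the hex dump. */
enum { REC_SIZE = 60, OFF_ID = 0, OFF_NAME = 2, OFF_TYPE = 22,
       OFF_SIZE = 57, OFF_AGE = 58 };

/* Combine two bytes stored low byte first into a 16-bit value.
 * Works the same on little- and big-endian hosts. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

struct decoded {
    uint16_t id, age;
    char name[20], type[35], size;
};

/* Pull each member out of the raw record at its fixed offset. */
static void decode_record(const unsigned char rec[REC_SIZE], struct decoded *out)
{
    out->id = le16(rec + OFF_ID);
    memcpy(out->name, rec + OFF_NAME, sizeof out->name);
    memcpy(out->type, rec + OFF_TYPE, sizeof out->type);
    out->size = (char)rec[OFF_SIZE];
    out->age = le16(rec + OFF_AGE);
}
```

Applied to the first 60 bytes of the dump, this yields id 1, name Abigail, type Capricornis sumatraensis, size S and age 8.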

After completing your byte examination, you should be able to do something like the following, reading 60 bytes at a time and then extracting the id, name, type, size, age values from them:

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <inttypes.h>

/* consts for max name, type, record size, max animals to read */
enum { MAXN = 20, MAXT = 35, RECSZ = 60, MAXA = 128 };

typedef struct {
    uint16_t id, age;
    char name[MAXN], type[MAXT], size;
} animal;

int main (int argc, char **argv) {

    int n = 0;                              /* array index */
    animal animals[MAXA] = {{ .id = 0 }};   /* animals array */
    FILE *fp = argc > 1 ? fopen (argv[1], "rb") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    while (n < MAXA) {  /* read up to MAXA animal records */
        uint8_t rec[RECSZ] = "",    /* record buffer */
                size = 0,           /* size of member */
                offset = 0;         /* offset in record */

        if (fread (rec, 1, RECSZ, fp) != RECSZ) /* read/validate rec */
            break;

        size = sizeof animals[n].id;            /* get id size */
        memcpy (&animals[n].id, rec, size);     /* copy from rec to id */
        offset += size;                         /* add size to rec offset */

        size = sizeof animals[n].name;          /* repeat for each member */
        memcpy (animals[n].name, rec + offset, size);
        offset += size;

        size = sizeof animals[n].type;
        memcpy (animals[n].type, rec + offset, size);
        offset += size;

        size = sizeof animals[n].size;
        memcpy (&animals[n].size, rec + offset, size);
        offset += size;

        size = sizeof animals[n].age;
        memcpy (&animals[n].age, rec + offset, size);

        n++;    /* increment array index after copy */
    }
    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    /* do what you need with data (printing here) */
    for (int i = 0; i < n; i++)
        printf ("%3" PRIu16 " %-20s %-35s %c %3" PRIu16 "\n",
                animals[i].id, animals[i].name, animals[i].type,
                animals[i].size, animals[i].age);

    return 0;
}

Note that id and age use the uint16_t 16-bit unsigned type. Also note the corresponding PRIu16 format specifier used in printf, and that the specifier macro is written outside the quoted parts of the format string.

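Note that above, because you are reading bytes, the size and nmemb parameters to fread are given as 1 and RECSZ, and a complete read is validated against the record size rather than against 1. You could reverse them; the validation is equivalent, but if you capture the return it is then the number of members read rather than the number of bytes. (E.g. you either read 60 1-byte members or 1 60-byte member; it is entirely up to you.)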
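The two ways of calling fread can be put side by side in a short sketch; RECORD_BYTES and the two function names are made up for the example.

```c
#include <stdio.h>

enum { RECORD_BYTES = 60 };

/* Read one record counting bytes: success is a return of RECORD_BYTES.
 * On a short read, the return value is the number of bytes that arrived. */
static int read_record_as_bytes(FILE *fp, unsigned char *rec)
{
    return fread(rec, 1, RECORD_BYTES, fp) == RECORD_BYTES;
}

/* Read one record counting whole records: success is a return of 1.
 * On a short read, the return value is 0 whatever number of bytes arrived. */
static int read_record_as_member(FILE *fp, unsigned char *rec)
{
    return fread(rec, RECORD_BYTES, 1, fp) == 1;
}
```

Both functions accept and reject the same reads; the byte-counting form is more useful when you want to report how much of a truncated record was present.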

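Putting the new code to use, reading the binary file into the array of animals prints one line per record in the order id, name, type, size, age.

Example Use/Output

With the animals.dat data shown in the other answer, that is 15 lines, ids 1 through 15, with the name unknown on records 2 and 8.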

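Look things over and let me know if you have further questions.

Footnotes:

1.) As with any software, know the site you are getting it from, verify the checksum, and run a virus scan before you consider loading it. If you are truly paranoid, load it in a virtual machine and run full diagnostics before bringing it into a production environment, but that may be overkill.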