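When I call the function readTheNRow with row = 0 (that is, I read the first row), I find that the first three characters are \357, \273 and \277. I found out that this prefix has something to do with UTF-8 files, but some files have it and some don't :(. How can I ignore every prefix of this kind in the files I want to read?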
#include <fcntl.h>   // open, O_RDONLY
#include <stdio.h>   // printf
#include <stdlib.h>  // exit
#include <unistd.h>  // read, write, close

int readTheNRow(char buff[], int row) {
    int file = open("my_file.txt", O_RDONLY);
    if (file < 0) {
        write(2, "opening file was unsuccessful\n", 30);
        exit(-1);
    }
    // function's variables
    int i = 0;
    char ch; // a temp variable to read into
    int check; // helper variable for checking read's return value
    // read until we reach the needed row
    while (i != row) {
        // read one char
        check = read(file, &ch, 1);
        if (check < 0) {
            // write an error message to the user
            write(2, "error occurred in reading\n", 26);
            exit(-1);
        }
        if (check == 0) {
            // read returned 0, which means we reached the end of the file
            close(file);
            return -1; // couldn't read row N (the file has fewer rows)
        }
        printf("%c", ch);
        // check whether the char is a \n
        if (ch == '\n') {
            i++;
        }
    }
    // read the number into the received buffer
    i = 0;
    do {
        // read one char
        check = read(file, buff + i, 1);
        if (check < 0) {
            // write an error message to the user
            write(2, "error occurred in reading\n", 26);
            exit(-1);
        }
        // if we reached the end of the file
        if (check == 0) {
            break;
        }
        i++;
    } while (buff[i - 1] != '\n');
    // replace the trailing \n, if any, with the terminating \0
    if (i > 0 && buff[i - 1] == '\n') {
        i--;
    }
    buff[i] = '\0';
    // close the file before returning; nothing placed after return would run
    if (close(file) < 0) {
        write(2, "closing file was unsuccessful\n", 30);
        exit(-1);
    }
    return 1; // reading was successful
}
Answer 0 (score: 6)
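You seem to be trying to read a file that starts with a so-called BOM (byte order mark).
Test for these prefixes; if one is present, use the information it carries and continue reading the file, interpreting it as the BOM indicates.
The sequence \357 \273 \277 (octal for the bytes 0xEF 0xBB 0xBF) indicates that UTF-8 follows. No byte order needs to be taken into account here, because the byte is the unit of such files.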
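One way to apply this to a file descriptor is sketched below. The helper names skip_utf8_bom and open_skip_bom are hypothetical (they do not appear in the question), and the sketch assumes the descriptor refers to a seekable regular file whose offset is at the start; it would not work on a pipe or FIFO, where the bytes cannot be "put back" with lseek.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: leaves fd positioned just past a leading UTF-8 BOM,
 * or back at offset 0 if the file does not start with one.
 * Returns 1 if a BOM was skipped, 0 if there was none, -1 on error.
 * Assumption: fd is seekable and currently at the start of the file. */
int skip_utf8_bom(int fd) {
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF }; /* \357 \273 \277 */
    unsigned char head[3];
    size_t got = 0;

    /* read() may deliver fewer bytes than requested, so loop until
     * three bytes have arrived or the end of the file is reached */
    while (got < sizeof head) {
        ssize_t n = read(fd, head + got, sizeof head - got);
        if (n < 0) return -1;
        if (n == 0) break;
        got += (size_t)n;
    }
    if (got == sizeof head && memcmp(head, bom, sizeof bom) == 0) {
        return 1; /* the offset is now 3, right after the BOM */
    }
    /* no BOM: rewind so that the caller sees the file from its first byte */
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) return -1;
    return 0;
}

/* Hypothetical wrapper: opens path read-only and skips a UTF-8 BOM if present.
 * Returns the descriptor, or -1 on error. */
int open_skip_bom(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    if (skip_utf8_bom(fd) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

In readTheNRow, calling skip_utf8_bom(file) right after the open call succeeds should be enough: files with the prefix are then read from their fourth byte, and files without it from their first.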
More on the various existing BOMs here: http://en.wikipedia.org/wiki/Byte_order_mark#Representations_of_byte_order_marks_by_encoding
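If files in other encodings may show up as well, the same test can be driven by a small table. The following is only a sketch: detect_bom, struct bom_entry and the encoding labels are names made up for illustration, and only the UTF-8, UTF-16 and UTF-32 marks are covered. It works on the first bytes of the file already read into a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative table entry; the layout is an assumption of this sketch. */
struct bom_entry {
    const char *name;       /* label for the encoding the mark announces */
    unsigned char bytes[4]; /* the mark itself, padded with zeros */
    size_t len;             /* number of significant bytes in the mark */
};

static const struct bom_entry bom_table[] = {
    /* 4-byte marks come first: the UTF-32 LE mark starts with the
     * same two bytes as the UTF-16 LE mark */
    { "UTF-32LE", { 0xFF, 0xFE, 0x00, 0x00 }, 4 },
    { "UTF-32BE", { 0x00, 0x00, 0xFE, 0xFF }, 4 },
    { "UTF-8",    { 0xEF, 0xBB, 0xBF, 0x00 }, 3 },
    { "UTF-16LE", { 0xFF, 0xFE, 0x00, 0x00 }, 2 },
    { "UTF-16BE", { 0xFE, 0xFF, 0x00, 0x00 }, 2 },
};

/* Looks for a known BOM at the start of buf (n bytes available).
 * On a match, stores the mark's length in *bom_len and returns its label;
 * otherwise stores 0 and returns NULL.
 * Caveat: UTF-16 LE text whose first character is U+0000 is reported
 * as UTF-32 LE, because the byte sequences are identical. */
const char *detect_bom(const unsigned char *buf, size_t n, size_t *bom_len) {
    for (size_t i = 0; i < sizeof bom_table / sizeof bom_table[0]; i++) {
        const struct bom_entry *e = &bom_table[i];
        if (n >= e->len && memcmp(buf, e->bytes, e->len) == 0) {
            *bom_len = e->len;
            return e->name;
        }
    }
    *bom_len = 0;
    return NULL;
}
```

The caller would skip bom_len bytes and then decode the rest according to the returned label; for a plain byte-oriented reader like the one in the question, only the UTF-8 case can simply be skipped and read as before.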