For the question "How to replace/ignore invalid Unicode/UTF8 characters � from C stdio.h getline()?", I came up with a possible solution, but I have not managed to get it working.
Here is the complete example:
FILE* cfilestream = fopen( "/filepath.txt", "r" );
size_t linebuffersize = 131072;  // getline() expects a size_t* for the buffer size
char* readline = (char*) malloc( linebuffersize );
char* fixedreadline = (char*) malloc( linebuffersize );
int index;
ssize_t charsread;               // getline() returns ssize_t, -1 on EOF or error
int invalidcharsoffset;
while( true )
{
    if( ( charsread = getline( &readline, &linebuffersize, cfilestream ) ) != -1 )
    {
        invalidcharsoffset = 0;
        for( index = 0; index < charsread; ++index )
        {
            // This is the comparison that triggers the warning shown below
            if( readline[index] != '�' ) {
                fixedreadline[index-invalidcharsoffset] = readline[index];
            }
            else {
                ++invalidcharsoffset;
            }
        }
        // Terminate the copy so it can be printed as a C string
        fixedreadline[charsread-invalidcharsoffset] = '\0';
        std::cerr << "fixedreadline=" << fixedreadline << std::endl;
    }
    else {
        break;
    }
}
When compiling, I get the following warning:
$ x86_64-linux-gnu-gcc -g -O0 -Wall -ggdb -std=c++11
source/fastfile.cpp:512:44: warning: multi-character character constant [-Wmultichar]
if( readline[index] != '�' ) {
^~~~~
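For context on the warning: in a UTF-8 encoded source file, � (U+FFFD) is stored as the three bytes EF BF BD, so '�' is a multi-character constant of type int with an implementation-defined value, and a single char can never compare equal to it. A minimal sketch of a byte-wise comparison follows. The names kReplacementChar, kReplacementCharSize and starts_with_replacement_char are hypothetical and only serve this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper names, chosen only for this illustration.
// UTF-8 encoding of U+FFFD, spelled with escapes so that it does not
// depend on the encoding of the source file.
static const char kReplacementChar[] = "\xEF\xBF\xBD";
static const std::size_t kReplacementCharSize =
    sizeof( kReplacementChar ) - 1; // 3 bytes, excluding the trailing '\0'

// Returns true if the first `available` bytes at `bytes` begin with the
// UTF-8 encoding of U+FFFD.
bool starts_with_replacement_char( const char* bytes, std::size_t available )
{
    return available >= kReplacementCharSize
        && std::memcmp( bytes, kReplacementChar, kReplacementCharSize ) == 0;
}
```

In a loop like the one above, this check would be called as starts_with_replacement_char( readline + index, charsread - index ), and the index would need to advance by three bytes on a match instead of one.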
And when running the program, it does not remove the � characters from the input string Føö�Bår.
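One thing that may be worth checking is whether the file really contains U+FFFD, or whether it contains a stray invalid byte that the terminal merely displays as �. The following is a rough sketch, not a definitive implementation, that handles both cases by copying only well-formed UTF-8 sequences and skipping an encoded U+FFFD. The function names utf8_sequence_length and strip_invalid_utf8 are hypothetical, and the use of std::string instead of the raw getline() buffers is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length (1 to 4) of the well-formed UTF-8
// sequence starting at text[pos], or 0 if the bytes there do not form one.
static std::size_t utf8_sequence_length( const std::string& text, std::size_t pos )
{
    const unsigned char lead = static_cast<unsigned char>( text[pos] );
    std::size_t length = 0;
    unsigned char low = 0x80, high = 0xBF; // allowed range of the second byte

    if( lead <= 0x7F ) {
        return 1; // plain ASCII
    }
    else if( lead >= 0xC2 && lead <= 0xDF ) {
        length = 2;
    }
    else if( lead >= 0xE0 && lead <= 0xEF ) {
        length = 3;
        if( lead == 0xE0 ) low = 0xA0;  // reject overlong encodings
        if( lead == 0xED ) high = 0x9F; // reject UTF-16 surrogates
    }
    else if( lead >= 0xF0 && lead <= 0xF4 ) {
        length = 4;
        if( lead == 0xF0 ) low = 0x90;  // reject overlong encodings
        if( lead == 0xF4 ) high = 0x8F; // reject code points above U+10FFFF
    }
    else {
        return 0; // stray continuation byte or invalid lead byte
    }

    if( pos + length > text.size() ) {
        return 0; // sequence is cut off at the end of the input
    }
    for( std::size_t i = 1; i < length; ++i ) {
        const unsigned char c = static_cast<unsigned char>( text[pos + i] );
        if( c < low || c > high ) {
            return 0;
        }
        low = 0x80;  // bytes after the second use the full
        high = 0xBF; // continuation range
    }
    return length;
}

// Hypothetical helper: returns a copy of text without invalid bytes and
// without any encoded U+FFFD.
std::string strip_invalid_utf8( const std::string& text )
{
    std::string cleaned;
    cleaned.reserve( text.size() );
    std::size_t pos = 0;
    while( pos < text.size() ) {
        const std::size_t length = utf8_sequence_length( text, pos );
        if( length == 0 ) {
            ++pos; // drop one offending byte and resynchronize
            continue;
        }
        if( length == 3 && text.compare( pos, 3, "\xEF\xBF\xBD" ) == 0 ) {
            pos += 3; // drop an encoded U+FFFD
            continue;
        }
        cleaned.append( text, pos, length );
        pos += length;
    }
    return cleaned;
}
```

With the getline() loop above, a call could look like strip_invalid_utf8( std::string( readline, charsread ) ). Dropping the bytes silently is a design choice made for this sketch; replacing them with a placeholder would be an equally reasonable alternative.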