I am trying to convert a UTF-8 encoded string to KOI8-R using the GNU iconv library. My minimal example is:

    #include <iconv.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        /* The letter П in UTF-8. */
        char* buffer = "\xd0\x9f";
        size_t len = 2;
        /* Note: since KOI8-R is an 8-bit encoding, the buffer should only
         * need a length of 1, but iconv returns -1 if the buffer is any
         * smaller than 4 bytes.
         */
        size_t len_in_koi = 4;
        char* buffer_in_koi = malloc(len_in_koi+1);
        /* A throwaway copy to give to iconv. */
        char* buffer_in_koi_copy = buffer_in_koi;
        iconv_t cd = iconv_open("UTF-8", "KOI8-R");
        if (cd == (iconv_t) -1) {
            fputs("Error while initializing iconv_t handle.\n", stderr);
            return 2;
        }
        if (iconv(cd, &buffer, &len, &buffer_in_koi_copy, &len_in_koi) != (size_t) -1) {
            /* Expecting f0 but get d0. */
            printf("Conversion successful! The byte is %x.\n", (unsigned char)(*buffer_in_koi));
        } else {
            fputs("Error while converting buffer to KOI8-R.\n", stderr);
            return 3;
        }
        iconv_close(cd);
        free(buffer_in_koi);
        return 0;
    }

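Apart from failing when my KOI8-R buffer is smaller than 4 bytes (even though it should need only one byte), this program incorrectly prints d0; the correct encoding of 'П' in KOI8-R is f0. iconv on the command line gives the right answer (e.g. echo П | iconv -t KOI8-R | hexdump), so what am I doing wrong when using the C interface?
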
Answer 0 (score: 4)

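You mixed up the "to" and "from" character set arguments to iconv_open. The target encoding comes first and the source encoding second, so the call should be iconv_open("KOI8-R", "UTF-8"). It just so happens that the character in slot D0 of KOI8-R has D0 as the first byte of its UTF-8 encoding, which is why the output looked almost plausible. The swap also explains the 4-byte requirement: the two input bytes were read as two KOI8-R characters, and each of them becomes a two-byte sequence in UTF-8.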