I am unable to convert a string from UTF-8 to GB2312. My conversion function is below:
void convert(const char *from_charset, const char *to_charset, char *inptr, char *outptr)
{
    size_t inleft = strlen(inptr);
    size_t outleft = inleft;
    iconv_t cd; /* conversion descriptor */

    if ((cd = iconv_open(to_charset, from_charset)) == (iconv_t)(-1))
    {
        fprintf(stderr, "Cannot open converter from %s to %s\n", from_charset, to_charset);
        exit(8);
    }

    /* return code of iconv() */
    int rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    if (rc == -1)
    {
        fprintf(stderr, "Error in converting characters\n");
        if (errno == E2BIG)
            printf("errno == E2BIG\n");
        if (errno == EILSEQ)
            printf("errno == EILSEQ\n");
        if (errno == EINVAL)
            printf("errno == EINVAL\n");
        iconv_close(cd);
        exit(8);
    }
    iconv_close(cd);
}
Here is an example of how I use it:
int len = 1000;
char *result = new char[len];
convert("UTF-8", "GB2312", some_string, result);
Edit: I get the E2BIG error most of the time.
Answer 0 (score: 4)
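outleft should be the size of the output buffer (e.g. 1000 bytes), not the size of the incoming string.
When converting, the length of the string usually changes in the process, and you cannot know how long the result will be until the conversion is done. E2BIG means that the output buffer was not large enough, in which case you need to give iconv() more output buffer space (note that it will already have converted some of the data and adjusted the four variables passed to it accordingly).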
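To make that concrete, here is a minimal sketch of the asker's function with outleft taken from the output buffer's capacity rather than from the input length. The name convert_into and the outsize parameter are hypothetical (they are not in the original post), and the single trailing nul byte assumes a target charset such as GB2312 in which one zero byte terminates a string.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the original post.
 * Converts the nul-terminated string `in` into `out`, which has room for
 * `outsize` bytes. Returns the number of bytes written (not counting the
 * terminating nul), or (size_t) -1 on failure with errno left as set by
 * iconv_open()/iconv() (E2BIG if `out` is too small). */
size_t convert_into(const char *from_charset, const char *to_charset,
                    const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char *inptr = (char *) in;  /* iconv() takes char **; the input bytes are only read */
    char *outptr = out;
    size_t inleft = strlen(in);
    size_t outleft;
    size_t rc;
    int saved_errno;

    if (outsize == 0) {
        errno = E2BIG;
        return (size_t) -1;
    }
    /* The key fix: outleft comes from the OUTPUT buffer size.
     * One byte is held back for the terminating nul. */
    outleft = outsize - 1;

    if ((cd = iconv_open(to_charset, from_charset)) == (iconv_t) -1)
        return (size_t) -1;

    rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    if (rc != (size_t) -1)
        rc = iconv(cd, NULL, NULL, &outptr, &outleft);  /* flush any shift state */

    saved_errno = errno;  /* iconv_close() may overwrite errno */
    iconv_close(cd);

    if (rc == (size_t) -1) {
        errno = saved_errno;
        return (size_t) -1;
    }

    *outptr = '\0';
    return (size_t) (outptr - out);
}
```

With the buffer from the question, the call would look like convert_into("UTF-8", "GB2312", some_string, result, len). If it fails with E2BIG, the caller can retry with a larger buffer.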
Answer 1 (score: 2)
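As others have noted, E2BIG means that the output buffer was not large enough for the conversion, and you were passing the wrong value for outleft.
But I have also noticed some other problems with your function. Namely, the way it works, the caller has no way of knowing how many bytes are in the output string. Your convert() function neither nul-terminates the output buffer nor tells the caller how many bytes it wrote to outptr.
If you want to deal with nul-terminated strings (and it appears that you do, since your input string is nul-terminated), you might find the following approach better: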
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

char *
convert (const char *from_charset, const char *to_charset, const char *input)
{
    size_t inleft, outleft, converted = 0;
    char *output, *outbuf, *tmp;
    const char *inbuf;
    size_t outlen;
    iconv_t cd;

    if ((cd = iconv_open (to_charset, from_charset)) == (iconv_t) -1)
        return NULL;

    inleft = strlen (input);
    inbuf = input;

    /* we'll start off allocating an output buffer which is the same size
     * as our input buffer. */
    outlen = inleft;

    /* we allocate 4 bytes more than what we need for nul-termination... */
    if (!(output = malloc (outlen + 4))) {
        iconv_close (cd);
        return NULL;
    }

    do {
        errno = 0;
        outbuf = output + converted;
        outleft = outlen - converted;

        converted = iconv (cd, (char **) &inbuf, &inleft, &outbuf, &outleft);
        if (converted != (size_t) -1 || errno == EINVAL) {
            /*
             * EINVAL  An incomplete multibyte sequence has been
             *         encountered in the input.
             *
             * We'll just truncate it and ignore it.
             */
            break;
        }

        if (errno != E2BIG) {
            /*
             * EILSEQ  An invalid multibyte sequence has been encountered
             *         in the input.
             *
             * Bad input, we can't really recover from this.
             */
            iconv_close (cd);
            free (output);
            return NULL;
        }

        /*
         * E2BIG   There is not sufficient room at *outbuf.
         *
         * We just need to grow our outbuffer and try again.
         */

        /* number of bytes written so far, measured from the start of output */
        converted = outbuf - output;
        outlen += inleft * 2 + 8;

        if (!(tmp = realloc (output, outlen + 4))) {
            iconv_close (cd);
            free (output);
            return NULL;
        }

        output = tmp;
        outbuf = output + converted;
    } while (1);

    /* flush the iconv conversion */
    iconv (cd, NULL, NULL, &outbuf, &outleft);
    iconv_close (cd);

    /* Note: not all charsets can be nul-terminated with a single
     * nul byte. UCS2, for example, needs 2 nul bytes and UCS4
     * needs 4. I hope that 4 nul bytes is enough to terminate all
     * multibyte charsets? */

    /* nul-terminate the string */
    memset (outbuf, 0, 4);

    return output;
}
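Since the 4-byte terminator above is a guess about the target charset, another option is to hand the byte count back to the caller explicitly. The following is only a sketch of that idea, not part of the answer's code: the name convert_with_length and the out_len parameter are hypothetical, the buffer is not nul-terminated at all, and unlike the function above it treats EINVAL (a truncated multibyte sequence at the end of the input) as a failure instead of silently dropping it.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant, not from the original answer.
 * Returns a malloc()ed buffer holding the converted bytes and stores the
 * byte count in *out_len. The buffer is NOT nul-terminated. Returns NULL
 * on any failure. The caller must free() the result. */
char *convert_with_length(const char *from_charset, const char *to_charset,
                          const char *input, size_t *out_len)
{
    size_t inleft = strlen(input);
    size_t cap = inleft + 4;    /* initial guess; grown on E2BIG */
    size_t used = 0;            /* bytes written to output so far */
    char *inbuf = (char *) input;
    char *output, *tmp;
    int flushing = 0;
    iconv_t cd;

    if ((cd = iconv_open(to_charset, from_charset)) == (iconv_t) -1)
        return NULL;

    if (!(output = malloc(cap))) {
        iconv_close(cd);
        return NULL;
    }

    for (;;) {
        char *outbuf = output + used;
        size_t outleft = cap - used;
        size_t rc = flushing
            ? iconv(cd, NULL, NULL, &outbuf, &outleft)      /* emit pending shift sequence */
            : iconv(cd, &inbuf, &inleft, &outbuf, &outleft);

        /* iconv() advances outbuf even when it fails part-way through */
        used = (size_t) (outbuf - output);

        if (rc != (size_t) -1) {
            if (flushing)
                break;          /* input converted and flushed */
            flushing = 1;
            continue;
        }

        if (errno != E2BIG) {   /* EILSEQ or EINVAL: give up */
            free(output);
            iconv_close(cd);
            return NULL;
        }

        /* E2BIG: grow the buffer and resume where iconv() stopped */
        cap = cap * 2 + 8;
        if (!(tmp = realloc(output, cap))) {
            free(output);
            iconv_close(cd);
            return NULL;
        }
        output = tmp;
    }

    iconv_close(cd);
    *out_len = used;
    return output;
}
```

A caller that wants a C string in a single-byte-nul charset such as GB2312 can still copy *out_len bytes into a buffer one byte larger and append '\0' itself.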