Is there a commonly used routine or library for this?
For example, &#39; must become '.
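Answer 0 (score: 1)
I would try parsing the digits out of the string, converting them to a number with atoi, and then assigning that number to a char.
This is something I wrote in about 20 seconds, so it is completely contrived: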
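char html[] = "&#39;";
char* pch = &html[2];   /* skip the leading "&#" */
int n = 0;
char c = 0;
pch[2] = '\0';          /* overwrite the ';' so the string ends after the two digits */
n = atoi(pch);          /* "39" -> 39 */
c = n;                  /* 39 is the character code of the apostrophe */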
Now c is '. I also don't know HTML strings well, so I may be missing something.
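The snippet above hard-codes two digits and never checks the input. As a rough sketch of the same idea with some validation, the function below uses strtol instead of atoi so it can see where the digits end. The name decode_decimal_entity and its return convention are made up for illustration, and limiting the result to one byte (1..255) is an assumption, not something the question specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): decode one decimal
 * numeric character reference such as "&#39;" found at the start of s.
 * On success, stores the byte in *out and returns the number of input
 * characters consumed. Returns 0 if s does not start with a well-formed
 * reference whose value fits in one byte (1..255). */
size_t decode_decimal_entity(const char *s, char *out)
{
    char *end;
    long value;

    assert(s != NULL && out != NULL);

    /* Must start with "&#" followed by at least one decimal digit.
     * Checking the digit here keeps strtol from accepting a sign
     * or leading whitespace. */
    if (strncmp(s, "&#", 2) != 0 || s[2] < '0' || s[2] > '9')
        return 0;

    errno = 0;
    value = strtol(s + 2, &end, 10);   /* end points just past the digits */
    if (errno != 0 || *end != ';' || value < 1 || value > 255)
        return 0;

    *out = (char)value;
    return (size_t)(end - s) + 1;      /* +1 for the trailing ';' */
}
```

Called as decode_decimal_entity("&#39;", &c), this stores an apostrophe in c and reports 5 characters consumed; a missing semicolon or a named entity such as "&amp;" makes it return 0 so the caller can copy the text through unchanged.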
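Answer 1 (score: 1)
Assuming you only care about &#xx; style entities, this isn't particularly hard. Here is the simple, let-someone-else-worry-about-memory-management, mechanical, who-needs-regular-expressions way: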
// Return the value (0..15) of a single hex digit, or -1 if 'hex' is not one.
int hex_to_value(char hex) {
    if (hex >= '0' && hex <= '9') { return hex - '0'; }
    if (hex >= 'A' && hex <= 'F') { return hex - 'A' + 10; }
    if (hex >= 'a' && hex <= 'f') { return hex - 'a' + 10; }
    return -1;
}
void unescape(char* dst, const char* src) {
    // Write the translated version of the text at 'src' to 'dst'.
    // All sequences of '&#xx;', where x is a hex digit, are replaced
    // with the corresponding single byte. 'dst' must have room for
    // strlen(src) + 1 bytes; the output is never longer than the input.
    typedef enum { NONE, AND, AND_HASH, AND_HASH_EX, AND_HASH_EX_EX } mode;
    char first_hex = 0, second_hex = 0;
    int translated = 0;  // int, so the -1 sentinel survives where char is unsigned
    mode m = NONE;
    while (*src) {
        char c = *src++;
        switch (m) {
        case NONE:
            if (c == '&') { m = AND; }
            else { *dst++ = c; m = NONE; }
            break;
        case AND:
            if (c == '#') { m = AND_HASH; }
            else { *dst++ = '&'; *dst++ = c; m = NONE; }
            break;
        case AND_HASH:
            translated = hex_to_value(c);
            if (translated != -1) { first_hex = c; m = AND_HASH_EX; }
            else { *dst++ = '&'; *dst++ = '#'; *dst++ = c; m = NONE; }
            break;
        case AND_HASH_EX:
            translated = hex_to_value(c);
            if (translated != -1) {
                second_hex = c;
                translated = hex_to_value(first_hex) << 4 | translated;
                m = AND_HASH_EX_EX;
            } else {
                *dst++ = '&'; *dst++ = '#'; *dst++ = first_hex; *dst++ = c;
                m = NONE;
            }
            break;
        case AND_HASH_EX_EX:
            if (c == ';') { *dst++ = (char)translated; }
            else {
                *dst++ = '&'; *dst++ = '#';
                *dst++ = first_hex; *dst++ = second_hex; *dst++ = c;
            }
            m = NONE;
            break;
        }
    }
    // Flush a sequence that was cut off by the end of the input.
    if (m != NONE) {
        *dst++ = '&';
        if (m >= AND_HASH) { *dst++ = '#'; }
        if (m >= AND_HASH_EX) { *dst++ = first_hex; }
        if (m >= AND_HASH_EX_EX) { *dst++ = second_hex; }
    }
    *dst = '\0';  // terminate the output string
}
Tedious, and more code than seems reasonable, but not hard :)
Answer 2 (score: 1)
There is "GNU recode", which is both a command-line program and a library: http://recode.progiciels-bpi.ca/index.html
Among other things, it can encode and decode HTML characters.