Replacing the special character &acute; with ' in C

Date: 2017-06-05 14:32:04

Tags: c

I have a string to parse that may contain the symbol "&acute;", depending on user data. I want to remove this symbol and replace it with '. Example:

Before -> Won&acute;t Get Fooled Again
After -> Won't Get Fooled Again

Here is my attempt, but it does not work at all.


4 answers:

Answer 0: (score: 3)

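You are using a string literal, which must be treated as read-only (effectively const char), so you cannot modify it. Attempting to write to it is undefined behavior.

You can change it to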

char permanent[]="Won&acute;t Get Fooled Again";

This creates a char array initialized with the characters of the string literal, and that array is writable.
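To make the difference concrete, here is a minimal sketch. The function name demoWritableArray and the shortened sample text are made up for illustration; the point is only that the array owns a writable copy, while the pointer refers to storage that must not be written to.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: returns 1 if the in-place edit produced the expected text. */
static int demoWritableArray(void)
{
  /* Array: the literal's characters are copied into storage we own. */
  char permanent[] = "Won&acute;t";
  /* Pointer to a literal: reading is fine, writing is undefined behavior. */
  const char *readOnly = "Won&acute;t";

  assert(strcmp(permanent, readOnly) == 0);   /* same contents to start with */

  permanent[3] = '\'';   /* OK: overwrites the '&' in our own copy */
  /* ((char *)readOnly)[3] = '\''; would be undefined behavior */

  /* The array is exactly as large as the literal plus its terminating '\0'. */
  return strcmp(permanent, "Won'acute;t") == 0
      && sizeof permanent == strlen(readOnly) + 1;
}
```

Note that this only overwrites one character; the rest of the entity still has to be removed, which is what the other answers deal with.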

Side note: the s pointer is completely useless. You can perform the checks directly on permanent.

Answer 1: (score: 3)

You may want to take a look at

char *strstr(const char *haystack, const char *needle)

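It returns a pointer to the first occurrence of needle in haystack, or NULL if needle is not found. Just take the returned pointer and write the replacement character there.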

PS: It is part of <string.h>.
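As a rough sketch of that idea (the helper name replaceFirstWithChar is hypothetical, and it assumes s is a writable array rather than a string literal): write the replacement character at the pointer strstr() returns, then slide the rest of the string left so the remainder of the entity disappears.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: replace the first occurrence of `needle` in the
   writable buffer `s` with the single character `c`.
   Returns 1 if a replacement was made, 0 otherwise. */
static int replaceFirstWithChar(char *s, const char *needle, char c)
{
  assert(s != NULL && needle != NULL);

  size_t nLen = strlen(needle);
  if (nLen == 0)
    return 0;                     /* nothing sensible to replace */

  char *hit = strstr(s, needle);  /* NULL when needle does not occur */
  if (hit == NULL)
    return 0;

  *hit = c;                       /* write the replacement character */
  /* Slide the tail (including the '\0') left over the rest of the needle. */
  memmove(hit + 1, hit + nLen, strlen(hit + nLen) + 1);
  return 1;
}
```

memmove() is used rather than memcpy() because the source and destination regions overlap.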

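Answer 2: (score: 0)

OK, several problems with your code have already been pointed out.

To recap:

  • You cannot modify a string expressed as a literal; you need a character array.
  • You should use strstr() to find the substring, rather than a hand-written chain of comparisons.
  • When doing the replacement, you should use memmove() to compact the string.

Here is my attempt: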

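#include <string.h>

/* Replace every occurrence of what in s with the shorter withWhat, in place. */
void replaceShorten(char *s, const char *what, const char *withWhat)
{
  char *hit;
  const size_t wLen = strlen(what);
  const size_t wwLen = strlen(withWhat);
  if(wwLen >= wLen)   /* the replacement must be strictly shorter */
    return;
  while((hit = strstr(s, what)) != NULL)
  {
    memcpy(hit, withWhat, wwLen);   /* write the replacement over the match */
    /* close the gap: move the tail, including the terminating '\0' */
    memmove(hit + wwLen, hit + wLen, strlen(hit + wLen) + 1);
    s = hit + wwLen;   /* continue searching after the replacement */
  }
}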

This is not optimized for maximum speed (it could use fewer strlen() calls), but it is hopefully fairly clear. It seems to work when used like this:

char s[] = "Don&acute;t do it!";
replaceShorten(s, "&acute;", "'");

Note that the replacement must shorten the string, as the name implies. Arbitrary replacements are not supported.
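On the speed remark: one way to avoid the repeated strlen() and memmove() calls is to compact the string in a single pass with a read pointer and a write pointer. The following is only a sketch of that alternative, not a drop-in from the answer above. The name replaceShortenOnePass is made up, it assumes withWhat does not point into s, and unlike replaceShorten it also accepts a replacement of equal length.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical single-pass variant: characters are copied from a read
   position to a write position, so each character moves at most once. */
static void replaceShortenOnePass(char *s, const char *what, const char *withWhat)
{
  assert(s != NULL && what != NULL && withWhat != NULL);

  const size_t wLen = strlen(what);
  const size_t wwLen = strlen(withWhat);
  if (wLen == 0 || wwLen > wLen)
    return;                        /* only same-length or shorter replacements */

  char *dst = s;                   /* write position */
  const char *src = s;             /* read position, never behind dst */
  while (*src != '\0')
  {
    if (strncmp(src, what, wLen) == 0)
    {
      /* Safe: dst + wwLen never passes src + wLen, so no unread
         input is overwritten. */
      memcpy(dst, withWhat, wwLen);
      dst += wwLen;
      src += wLen;
    }
    else
    {
      *dst++ = *src++;             /* ordinary character: copy it through */
    }
  }
  *dst = '\0';                     /* terminate the compacted string */
}
```

It is used the same way as replaceShorten(), for example replaceShortenOnePass(s, "&acute;", "'") on a writable array s.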

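Answer 3: (score: 0)

Personally, I would break this down as follows:

  • First, don't try to modify the input string; instead, write the converted string to a different buffer;
  • Second, break your operations into separate functions: one to find an entity in the string, one to map the entity to a replacement string, one to perform the replacement, and so on;
  • Third, use a lookup table or something similar, instead of trying to match individual characters.

Here is a first pass off the top of my head. It undoubtedly has some ugly bugs, but it should give you an idea of what I am trying to do: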

#include <stdio.h>
#include <string.h>

/**
 * Create a structure to map an entity name onto a replacement string...
 */
struct entity_lookup {
   char *entity_name;
   char *replacement;
};

/**
 * ... and use that structure to build the lookup table
 */
static const struct entity_lookup lookup_table[] =  {
  { "acute", "'" },
  { "amp", "&" },
  { "apos", "'"},
  { "lt", "<" },
  { "gt", ">" },
  { "nbsp", " " },
  { NULL, NULL }
};

/**
 * Scan the input string for the next entity, which begins with the
 * character '&' and ends with ';' - if we don't find the trailing
 * ';', then we treat the '&' as a literal character.  Returns the location
 * of the first character of the entity name in the string.  
 */
char *getNextEntity( const char *str, char **start, size_t *len )
{
  *len = 0;
  *start = strchr( str, '&' );

  if ( *start )
  {
    *start = *start + 1; // we want the text *following* &
    const char *end = strchr( *start, ';' );
    if ( end )
    {
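      // NOTE: this stores the name length minus one; the caller adds 1
      // to copy the name and 2 to step past the trailing ';'.
      *len = end - *start - 1;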
    }
    else
    {
      *start = NULL;
      *len = 0;
    }
  }

  return *start;
}

/**
 * Find the replacement string for the given entity name.  Since our lookup
 * table is so small, a linear search is just fine in this case.
 */
char *getEntityReplacement( const char *entity )
{
  const struct entity_lookup *entry = lookup_table;

  while ( entry->entity_name != NULL && strcmp( entity, entry->entity_name ) )
  {
    entry++;
  }

  return entry->replacement;
}

/**
 * Replace any and all entities in an input string.  This code tries
 * to avoid any buffer overflows, but it's not very pretty and I haven't
 * exercised it that well.  I'm sure I could come up with a more elegant
 * method, but I've spent enough time on this already.
 */
void replaceEntities( const char * restrict input, char * restrict output, size_t maxOutputLen )
{
  char *entityStart = NULL;
  size_t entityLen = 0;

  /**
   * Initially point to the beginning of the input string.
   */
  const char *current = input;

  /**
   * Initially write an empty string to the output buffer.
   */
  *output = 0;

  /**
   * Look for the next entity; if we find one, copy everything from
   * the current position in the input buffer up to (but not including) 
   * the first character of the entity; then we copy the replacement
   * for the entity to the output buffer.
   */
  while ( getNextEntity( current, &entityStart, &entityLen ) )
  {
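    /* Parenthesize the pointer difference so that a length is added to a
       length, rather than a length to a pointer. */
    if ( strlen( output ) + (size_t)( entityStart - current - 1 ) <  maxOutputLen - 1 )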
    {
      /**
       * Copy everything from the current position to the start of the
       * entity to the output buffer; for example, copy "Won" to 
       * the output.
       */
      strncat( output, current, entityStart - current - 1 );

      /**
       * Find the entity in the lookup table.
       */
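      char entityText[20] = {0};
      /* Copy the name only if it fits; otherwise leave entityText empty
         so the lookup below fails instead of overflowing the buffer. */
      if ( entityLen + 1 < sizeof entityText )
        strncpy( entityText, entityStart, entityLen + 1 );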

      char *repl = getEntityReplacement( entityText );

      /**
       * If there's a match and there's room in the output buffer,
       * write the replacement text to the output buffer, e.g., "'"
       */
      if ( repl && strlen( output ) + strlen( repl ) < maxOutputLen - 1 )
        strcat( output, repl );
      else
      {
        output[maxOutputLen - 1] = 0;  // last valid index; maxOutputLen is one past the end
        return;
      }
      current = entityStart + entityLen + 2;
    }
    else
    {
      output[maxOutputLen - 1] = 0;  // last valid index; maxOutputLen is one past the end
      return;
    }
  }

  /**
   * If we don't find any more entities, write the remainder of the input
   * string to the output buffer, e.g., "t Get Fooled Again"
   */
  if ( strlen( output ) + strlen( current ) < maxOutputLen - 1 )
    strcat( output, current );
}

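int main( int argc, char **argv )
{
  char output[128];

  /* argv[1] is only valid when an argument was actually supplied. */
  if ( argc < 2 )
  {
    fprintf( stderr, "usage: %s string\n", argv[0] );
    return 1;
  }

  replaceEntities( argv[1], output, sizeof output );

  printf( "Original: %s\n", argv[1] );
  printf( "Stripped: %s\n", output );

  return 0;
}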

As always, the proof is in the pudding. Here are some sample runs:

$ ./entities "Won&acute;t Get Fooled Again"
Original: Won&acute;t Get Fooled Again
Stripped: Won't Get Fooled Again

$ ./entities "Won&acute;t&nbsp;Get&nbsp;Fooled&nbsp;Again"
Original: Won&acute;t&nbsp;Get&nbsp;Fooled&nbsp;Again
Stripped: Won't Get Fooled Again

$ ./entities "Black &amp; Blue"
Original: Black &amp; Blue
Stripped: Black & Blue

$ ./entities "Black & Blue"
Original: Black & Blue
Stripped: Black & Blue

$ ./entities "#include &lt;stdio.h&gt;"
Original: #include &lt;stdio.h&gt;
Stripped: #include <stdio.h>

This code is too fragile for production use, but again, it should give you some ideas.