Question

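I have the following code for normalizing a char array. When the procedure finishes, the normalized file still has some of the old output left over at the end. This happens because i reaches the end of the array before j does. That makes sense, but how do I remove the extra characters? I am coming from Java, so I apologize if I am making mistakes that seem simple. Here is the code: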

/* The normalize procedure normalizes a character array of size len 
   according to the following rules:
     1) turn all upper case letters into lower case ones
     2) turn any white-space character into a space character and, 
        shrink any n>1 consecutive whitespace characters to exactly 1 whitespace

     When the procedure returns, the character array buf contains the newly 
     normalized string and the return value is the new length of the normalized string.

     hint: you may want to use C library function isupper, isspace, tolower
     do "man isupper"
*/
int
normalize(unsigned char *buf,   /* The character array contains the string to be normalized*/
                    int len     /* the size of the original character array */)
{
    /* use a for loop to cycle through each character and the built-in C functions to analyze it */
    int i = 0;
    int j = 0;
    int k = len;

    if(isspace(buf[0])){
        i++;
        k--;
    }
    if(isspace(buf[len-1])){
        i++;
        k--;
    }
    for(i;i < len;i++){
        if(islower(buf[i])) {
            buf[j]=buf[i];
            j++;
        }
        if(isupper(buf[i])) {
            buf[j]=tolower(buf[i]);
            j++;
        }
        if(isspace(buf[i]) && !isspace(buf[j-1])) {
            buf[j]=' ';
            j++;
        }
        if(isspace(buf[i]) && isspace(buf[i+1])){
            i++;
            k--;
        }
    }

   return k;

}

Here is some sample output:

halb mwqcnfuokuqhuhy ja mdqu nzskzkdkywqsfbs zwb lyvli HALB MwQcnfuOKuQhuhy Ja mDQU nZSkZkDkYWqsfBS ZWb lyVLi

As you can see, the end part is repeated. Both the new normalized data and the old un-normalized data are present in the result. How can I fix this?

Answer 1

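Add a null terminator at the new length and return that length. Here newLength stands for the number of characters actually written (the write index j in the question's loop, not k, which is an int and cannot be indexed). The buffer must have room for the extra byte.

buf[newLength] = '\0';
return newLength;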
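
To illustrate why the terminator removes the stale tail: normalizing in place shrinks the string, but the bytes past the new length are still in memory until something cuts them off. The following is a minimal sketch, not part of the original answer. The helper name terminate_at and its capacity parameter are hypothetical, and it assumes the caller knows how many bytes were allocated for the buffer.

```c
#include <stddef.h>

/* Hypothetical helper: cut a buffer off at new_len by writing a terminator.
   capacity is the total number of bytes allocated for buf (assumption:
   the caller tracks this). Returns 1 on success, 0 if there is no room. */
int terminate_at(unsigned char *buf, size_t new_len, size_t capacity)
{
    if (new_len >= capacity)
        return 0;          /* buf[new_len] would be out of bounds */
    buf[new_len] = '\0';   /* string functions stop here and ignore the old tail */
    return 1;
}
```

With a buffer holding "halb HALB", calling terminate_at(buf, 4, sizeof buf) leaves a string that reads as "halb". The bounds check matters when nothing was removed, because then the new length equals the original length.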

Answer 2

Fix it like this:

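#include <ctype.h>

/* buf must have room for len + 1 bytes: if nothing is removed, j == len
   and the terminator is written at buf[len]. */
int normalize(unsigned char *buf, int len) {
    int i, j;

    for(j=i=0 ;i < len; ++i){
        if(isupper(buf[i])) {
            buf[j++]=tolower(buf[i]);
            continue ;
        }
        if(isspace(buf[i])){
            /* emit a space only if the previous output character is not one */
            if(j == 0 || buf[j-1] != ' ')
                buf[j++]=' ';
            continue ;
        }
        buf[j++] = buf[i];
    }
    buf[j] = '\0';   /* cut off the stale tail */

    return j;        /* the write index is the new length */
}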
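
The question's code also tries to skip whitespace at the very start and end of the buffer, which the stated rules do not require. If that trimming is wanted, it could be folded into the same single pass. This is only a sketch under that assumption. The name normalize_trim is hypothetical, and unlike the function above it writes no terminator, so it never touches buf[len].

```c
#include <ctype.h>

/* Hypothetical variant: lowercases, collapses whitespace runs to one space,
   and drops leading and trailing whitespace. Returns the new length.
   No terminator is written; the caller uses the returned length. */
int normalize_trim(unsigned char *buf, int len)
{
    int i, j = 0;

    for (i = 0; i < len; ++i) {
        if (isspace(buf[i])) {
            /* skip whitespace at the start (j == 0) and inside a run */
            if (j > 0 && buf[j - 1] != ' ')
                buf[j++] = ' ';
        } else {
            /* tolower leaves non-letters unchanged */
            buf[j++] = (unsigned char)tolower(buf[i]);
        }
    }
    /* at most one trailing space can remain; remove it */
    if (j > 0 && buf[j - 1] == ' ')
        --j;
    return j;
}
```

For the input "  Hello   World  " the first 11 bytes of the buffer become "hello world" and the function returns 11.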

Answer 3

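Or add a null terminator. Use the character '\0' rather than NULL (which is a pointer constant), and write it into buf at the new length; k is an int and cannot be indexed:

    buf[newLength] = '\0';
    return newLength;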
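
Since the question writes the result to a file, another option is to skip the terminator and write only as many bytes as the returned length says. The sketch below assumes the output goes through a stdio FILE pointer, which the question does not show. The helper name write_normalized is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write only the first new_len bytes of buf to out,
   so the stale tail past new_len never reaches the file.
   Returns the number of bytes actually written. */
size_t write_normalized(FILE *out, const unsigned char *buf, size_t new_len)
{
    return fwrite(buf, 1, new_len, out);
}
```

For this to work, the length passed in must be the number of characters actually written by the normalization loop (j in the question's code), not the separately maintained counter k.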

Normalizing a char array and removing the extra characters (truncation)
