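Anagram tester in C

Date: 2019-02-16 01:58:13

Tags: c anagram

I am trying to implement an anagram tester in C. When calling the program, the user enters two words in double quotes, such as "listen" and "silent". I have almost got it working, but a helper function I wrote to strip spaces from the two input words is giving me trouble. Here is the code for that function: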

void noSpaces(char word[100]) {
    /*
    This is a function to get rid of spaces in a word
    It does this by scanning for a space and shifting the
    array elements at indices > where the space is
    down by 1 as long as there is still a space
    there. 
    */
    for (int i = 0; i < 100; i++) {
        while (word[i] == ' ') {
            for (int j = i; j < 100; j++) {
                word[j] = word[j+1];
            }
        }
    }
}

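This works fine when I pass the first input word from main to the helper. The problem is the second call. When I call the function on the second input, and k is the number of spaces in the first input, the function erases the first k letters of the second input. For example, running ./anagram " banana" "banana" gives me a false negative, and if I add a print statement to see what the inputs look like after noSpaces has been called on them, I get the following: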

banana
anana

Here is the code for the complete program:

#include <stdio.h>

int main(int argc, char *argv[]) {
    //this if statement checks for empty entry
    if (isEmpty(argv[1]) == 0 || isEmpty(argv[2]) == 0) {
        //puts("one of these strings is empty");
        return 1;
    }
    //call to noSpaces to eliminate spaces in each word
    noSpaces(argv[1]);
    noSpaces(argv[2]);
    //call to sortWords
    sortWords(argv[1]);
    sortWords(argv[2]);
    int result = compare(argv[1], argv[2]);
    /*
    if (result == 1) {
        puts("Not anagrams");
    } else {
        puts("Anagrams");
    }
    */
    return result;
}

int compare(char word1[100], char word2[100]) {
    /*
    This is a function that accepts two sorted 
    char arrays (see 'sortWords' below) and
    returns 1 if it finds a different character
    at entry i in either array, or 0 if at no 
    index the arrays have a different character.
    */
    int counter = 0;
    while (word1[counter] != '\0' && word2[counter] != '\0') {
        if (word1[counter] != word2[counter]) {
            //printf("not anagrams\n");
            return 1;
        }
        counter++;
    }
    // printf("anagrams\n");
    return 0;
}

void sortWords(char word[100]) {
    /*
    This is a function to sort the input char arrays
    it's a simple bubble sort on the array elements.
    'sortWords' function accepts a char array and returns void,
    sorting the entries in alphabetical order
    being careful about ignoring the 'special character'
    '\0'.
    */
    for (int j = 0; j < 100; j++) {
        int i = 0;
        while (word[i + 1] != '\0') {
            if (word[i] > word[i + 1]) {
                char dummy = word[i + 1];
                word[i + 1] = word[i];
                word[i] = dummy;
            }
            i++;
        }
    }
}

void noSpaces(char word[100]) {
    /*
    This is a function to get rid of spaces in a word
    It does this by scanning for a space and shifting the
    array elements at indices > where the space is
    down by 1 as long as there is still a space there. 
    */
    for (int i = 0; i < 100; i++) {
        while (word[i] == ' ') {
            for (int j = i; j < 100; j++) {
                word[j] = word[j + 1];
            }
        }
    }
}

int isEmpty(char word[100]) {
    // if a word consists of the empty character, it's empty
    //otherwise, it isn't
    if (word[0] == '\0') {
        return 0;
    }
    return 1;
}

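I know there is a library for handling strings, but I would really like to avoid using it. I have come this far without needing it, and I feel the problem is mostly solved except for one small thing I cannot see.

I come from a Java background and I am new to C, if that explains any of the mistakes I have made.

4 answers:

Answer 0: (score: 1)

You are making a logic error in your helper function. You start copying from word[j] rather than from the beginning of the second word, so you strip as many leading characters as there are leading spaces, as you can see in your output.

Note that j = i, and i counts the number of leading spaces in the outer loop.

By the way, you should only have two loops. Put the while condition inside the first for loop, like this: for (int i = 0; i<100 && word[i]==' '; i++)

To fix your logic error, you need another iterator k, initialized to zero, in the innermost loop, and use word[k] = word[j+1]. I think that would work.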

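Answer 1: (score: 1)

You have a buffer overflow on argv[1] and argv[2] whenever the argv[1] buffer is shorter than 100 bytes. I think a loop bounded by strlen(word) is all you need. When you use a fixed length of 100 in the for loop, word sometimes picks up data from another memory location and puts your program into undefined behavior. The other functions have the same problem, namely sortWords and compare.

Here is my modification of your noSpaces function, which should work.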

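#include <string.h>

void noSpaces(char word[100]) {
    /*
    This is a function to get rid of spaces in a word
    It does this by scanning for a space and shifting the
    array elements at indices > where the space is
    down by 1 as long as there is still a space
    there.
    */
    /* bound by strlen(word), not strlen(word) - 1, so a trailing
       space is removed and an empty string does not wrap around */
    for (size_t i = 0; i < strlen(word); i++) {
        while (word[i] == ' ') {
            /* the last step copies the terminator down one slot */
            for (size_t j = i; j < strlen(word); j++) {
                word[j] = word[j + 1];
            }
        }
    }
}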
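Since the question prefers to stay away from the string library, the same bounded shifting can be done with a hand-written length scan. The following is only a sketch: `str_length` and `noSpacesNoLib` are hypothetical names that do not appear in the original code, and the length is computed once and decremented after each shift instead of being rescanned.

```c
#include <stddef.h>

/* Hypothetical helper: counts bytes up to, not including, the terminator. */
size_t str_length(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical variant of noSpaces that never reads past the terminator. */
void noSpacesNoLib(char word[]) {
    size_t len = str_length(word);
    size_t i = 0;
    while (i < len) {
        if (word[i] == ' ') {
            /* shift the tail, terminator included, down by one */
            for (size_t j = i; j < len; j++) {
                word[j] = word[j + 1];
            }
            len--;          /* the string is now one byte shorter */
        } else {
            i++;            /* only advance when no shift happened */
        }
    }
}
```

Calling `noSpacesNoLib` on a writable array holding " a  b " leaves "ab" in it, and an empty string is left untouched.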

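Answer 2: (score: 1)

Instead of removing spaces and sorting, which runs in O(N lg N) time, you can do this in O(N) by building an array that holds the count of each letter in a word, ignoring spaces as you go.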

#include <stdbool.h>
#include <string.h>

// Iterate over each character in the string
// For each char in string, increment the count of that character
// in the lettercount array.
// Return the number of unfiltered letters that were counted
int fillLetterCountTable(const char* string, int* lettercount)
{
    size_t len = strlen(string);
    int valid = 0;

    for (size_t i = 0; i < len; i++)
    {
       unsigned char index = (unsigned char)(string[i]);
       if (index ==  ' ')  // ignore spaces
       {
           continue;
       }
       lettercount[index] += 1;
       valid++;
    }

    return valid;
}

// compare if two strings are anagrams of each other
// return true if string1 and string2 are anagrams, false otherwise
bool compare(const char* string1, const char* string2)
{
    int lettercount1[256] = {0};
    int lettercount2[256] = {0};

    int valid1 = fillLetterCountTable(string1, lettercount1);
    int valid2 = fillLetterCountTable(string2, lettercount2);

    if (valid1 != valid2)
        return false;

    // memcmp(lettercount1, lettercount2, sizeof(lettercount1));
    for (int i = 0; i < 256; i++)
    {
        if (lettercount1[i] != lettercount2[i])
            return false;
    }
    return true;
}
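The counting idea can also be written with a single table that goes up for one word and down for the other. This is a sketch of that variant rather than the answer's own code, and `areAnagramsOneTable` is a hypothetical name. It avoids the string library entirely and sizes the table from `UCHAR_MAX` instead of assuming 256 entries.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical variant: one balance table, spaces ignored. */
bool areAnagramsOneTable(const char *a, const char *b) {
    int balance[UCHAR_MAX + 1] = {0};

    /* every non-space byte of the first word adds one */
    for (; *a != '\0'; a++) {
        if (*a != ' ') {
            balance[(unsigned char)*a]++;
        }
    }
    /* every non-space byte of the second word removes one */
    for (; *b != '\0'; b++) {
        if (*b != ' ') {
            balance[(unsigned char)*b]--;
        }
    }
    /* anagrams leave every entry at exactly zero */
    for (int i = 0; i <= UCHAR_MAX; i++) {
        if (balance[i] != 0) {
            return false;
        }
    }
    return true;
}
```

With this sketch, " listen" and "silent   " compare as anagrams, while " banana" and "bananas" do not, because the extra 's' leaves one entry at -1.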

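Answer 3: (score: 1)

In C, strings are arrays of char with a null terminator, i.e. a byte with the value 0, usually written '\0'. You should not assume any particular length such as 100. In fact, the compiler ignores the array size in function prototype parameters. You can determine the length by scanning for the null terminator, which is what strlen() does efficiently, or you can write the code so that it avoids multiple scans and stops at the null terminator. You should make sure your functions work for the empty string, which is an array containing a single null byte. Here are the problems in your code: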

In function noSpaces, you iterate beyond the end of the string, modifying memory that may belong to the next string. The program has undefined behavior.
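The symptom in the question can be reproduced safely with a sketch like the one below. It assumes the two argument strings sit back to back in memory, which is common for argv but not guaranteed by the C standard, and it imitates that layout inside one zeroed 128-byte buffer so the 100-byte walk stays in bounds. `noSpacesFixed100` is the question's loop under a different name, and `simulateAdjacentArgs` is a hypothetical helper.

```c
/* The question's loop, unchanged: it always walks 100 bytes. */
static void noSpacesFixed100(char word[]) {
    for (int i = 0; i < 100; i++) {
        while (word[i] == ' ') {
            for (int j = i; j < 100; j++) {
                word[j] = word[j + 1];
            }
        }
    }
}

/* Hypothetical demo: stores " banana" and "banana" back to back in buf,
   strips spaces from the first one only, and returns the address where
   the second string originally started. */
const char *simulateAdjacentArgs(char buf[128]) {
    const char first[] = " banana";
    const char second[] = "banana";
    int n = 0;

    for (int i = 0; i < 128; i++) {
        buf[i] = '\0';
    }
    for (int i = 0; i < (int)sizeof first; i++) {
        buf[n++] = first[i];        /* copies the terminator too */
    }
    char *secondStart = buf + n;    /* plays the role of argv[2] */
    for (int i = 0; i < (int)sizeof second; i++) {
        buf[n++] = second[i];
    }

    noSpacesFixed100(buf);          /* shifts both strings down by one */
    return secondStart;
}
```

After the call, buf holds "banana" and the returned pointer sees "anana": removing one space from the first word slid the second word one byte to the left, underneath the pointer that still refers to its old start.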

You should stop at the end of the string. You can also do the work in linear time with two index variables:

void noSpaces(char word[]) {
    /*
    This is a function to get rid of spaces in a word.
    It reads with index i and writes with index j, copying
    each non-space character down to the next free slot,
    then terminates the shortened string.
    */
    int i, j;
    for (i = j = 0; word[i] != '\0'; i++) {
        if (word[i] != ' ') {
            word[j++] = word[i];
        }
    }
    word[j] = '\0';
}

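You can simplify compare so that it uses about a third fewer tests on average (two comparisons per character instead of three):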

int compare(const char word1[], const char word2[]) {
    /*
    This is a function that accepts two sorted 
    char arrays (see 'sortWords' below) and
    returns 1 if it finds a different character
    at entry i in either array, or 0 if at no 
    index the arrays have a different character.
    */
    for (int i = 0; word1[i] == word2[i]; i++) {
        if (word1[i] == '\0') {
            //printf("anagrams\n");
            return 0;
        }
    }
    // printf("not anagrams\n");
    return 1;
}

sortWords has undefined behavior for the empty string, because it reads the char at index 1, beyond the end of the array. Here is a corrected version:

void sortWords(char word[]) {
    /*
    This is a function to sort the input char arrays
    it's a simple bubble sort on the array elements.
    'sortWords' function accepts a char array and returns void,
    sorting the entries in alphabetical order
    being careful about ignoring the 'special character'
    '\0'.
    */
    for (int j = 0; word[j] != '\0'; j++) {
        /* one full pass over adjacent pairs, stopping at the terminator */
        for (int i = 1; word[i] != '\0'; i++) {
            if (word[i - 1] > word[i]) {
                char dummy = word[i - 1];
                word[i - 1] = word[i];
                word[i] = dummy;
            }
        }
    }
}

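You should declare functions before using them, or alternatively define them before use. Your code compiles only because the compiler accepts old-style C, in which the prototype of a function not yet seen is inferred from the arguments passed at the first call site. This practice is error-prone and obsolete.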
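A minimal sketch of that layout: prototypes come first, then the caller, then the definitions. The names `isEmptyWord`, `stripSpaces` and `prepareWord` are hypothetical stand-ins for the question's helpers and main, and `isEmptyWord` here returns 1 for an empty string, which is the opposite of the question's isEmpty.

```c
/* Prototypes: the compiler checks every later call against these. */
int isEmptyWord(const char word[]);
void stripSpaces(char word[]);

/* Hypothetical caller standing in for main: returns 0 if the word is
   empty, otherwise strips its spaces in place and returns 1. */
int prepareWord(char word[]) {
    if (isEmptyWord(word)) {
        return 0;
    }
    stripSpaces(word);
    return 1;
}

/* Definitions may now appear after their first use. */
int isEmptyWord(const char word[]) {
    return word[0] == '\0';
}

void stripSpaces(char word[]) {
    int i, j;
    for (i = j = 0; word[i] != '\0'; i++) {
        if (word[i] != ' ') {
            word[j++] = word[i];
        }
    }
    word[j] = '\0';
}
```

With the prototypes in place, a call with the wrong number or type of arguments is diagnosed at compile time instead of being silently accepted.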

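Your sorting function has quadratic time complexity, which can be slow for very long strings, but words should not be very long, so this is not a problem.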
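If long inputs ever did matter, a counting sort over byte values would sort the characters in linear time. This is a swapped-in technique, not the bubble sort from the question, and `sortWordsCounting` is a hypothetical name. Note that it orders bytes by their unsigned value, which matches the original ordering for plain ASCII text.

```c
#include <limits.h>

/* Hypothetical linear-time alternative to the bubble sort. */
void sortWordsCounting(char word[]) {
    int count[UCHAR_MAX + 1] = {0};

    /* tally how often each byte value occurs */
    for (int i = 0; word[i] != '\0'; i++) {
        count[(unsigned char)word[i]]++;
    }

    /* rewrite the string in increasing byte order; value 0 is skipped
       because it is the terminator and never counted */
    int out = 0;
    for (int c = 1; c <= UCHAR_MAX; c++) {
        while (count[c] > 0) {
            word[out++] = (char)c;
            count[c]--;
        }
    }
    word[out] = '\0';
}
```

Applied to a writable array holding "silent", this leaves "eilnst". Any consistent ordering is enough for the anagram test, since both words are sorted the same way before being compared.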

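It would be better not to modify the argument strings. You can perform the test on a copy of one of the strings, with the same time complexity.

Here is a direct approach: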

#include <stdio.h>

int check_anagrams(const char word1[], const char word2[]) {
    /*
       This function accepts two strings and returns 1 if they
       are anagrams of one another, ignoring spaces.
       The strings are not modified.
     */
    int i, j, len1, letters1, letters2;

    /* compute the length and number of letters of word1 */
    for (len1 = letters1 = 0; word1[len1] != '\0'; len1++) {
        if (word1[len1] != ' ')
            letters1++;
    }

    /* create a copy of word1 in automatic storage */
    /* this is an array, not a string; the + 1 avoids a zero-length
       array when word1 is empty */
    char copy[len1 + 1];
    for (i = 0; i < len1; i++)
        copy[i] = word1[i];

    for (j = letters2 = 0; word2[j] != '\0'; j++) {
        char temp = word2[j];
        if (temp != ' ') {
            letters2++;
            for (i = 0; i < len1; i++) {
                if (copy[i] == temp) {
                    copy[i] = '\0';
                    break;
                }
            }
            if (i == len1) {
                /* letter was not found */
                return 0;
            }
        }
    }
    if (letters1 != letters2)
        return 0;
    return 1;
}

int main(int argc, char *argv[]) {
    const char *s1 = " listen";
    const char *s2 = "silent   ";
    if (argc >= 3) {
        s1 = argv[1];
        s2 = argv[2];
    }
    int result = check_anagrams(s1, s2);
    if (result == 0) {
        printf("\"%s\" and \"%s\" are not anagrams\n", s1, s2);
    } else {
        printf("\"%s\" and \"%s\" are anagrams\n", s1, s2);
    }
    return result;
}