How do I parse a string in C?

Asked: 2012-11-05 10:33:22

Tags: c

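I'm a beginner learning C, so please go easy on me. :)

I'm trying to write a very simple program that turns each word of a string into a "Hi (input)!" sentence (it assumes you type in names). I'm also using arrays because I need to practice them.

My problem is that some garbage gets put into one of the arrays somewhere, and it messes up the program. I tried to find the problem, but to no avail, so it's time to ask the experts for help. Where did I go wrong?

P.S.: There is also an infinite loop somewhere, but it may be a result of the garbage that ends up in the array.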

#include <stdio.h>
#define MAX 500 //Maximum Array size.

int main(int argc, const char * argv[])
{
    int stringArray [MAX];
    int wordArray [MAX];
    int counter = 0;
    int wordCounter = 0;

    printf("Please type in a list of names then hit ENTER:\n");  
    // Fill up the stringArray with user input.
    stringArray[counter] = getchar();
    while (stringArray[counter] != '\n') {
        stringArray[++counter] = getchar();
    }

    // Main function.
    counter = 0;
    while (stringArray[wordCounter] != '\n') {     
        // Puts first word into temporary wordArray.
        while ((stringArray[wordCounter] != ' ') && (stringArray[wordCounter] != '\n')) {
            wordArray[counter++] = stringArray[wordCounter++];
        }
        wordArray[counter] = '\0';

        //Prints out the content of wordArray.
        counter = 0;
        printf("Hi ");
        while (wordArray[counter] != '\0') {
            putchar(wordArray[counter]);
            counter++;
        }
        printf("!\n");

        //Clears temporary wordArray for new use.
        for (counter = 0; counter == MAX; counter++) {
            wordArray[counter] = '\0';
        } 
        wordCounter++;
        counter = 0; 
    }
    return 0;
}

Solved! I needed to add the following check at the end, where I increment wordCounter. :)

    if (stringArray[wordCounter] != '\n') {
            wordCounter++;
    }

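3 Answers:

Answer 0 (score: 4)

You're using int arrays to represent strings, probably because getchar() returns int. However, strings are better represented as char arrays, since that's what they are in C. The fact that getchar() returns int is certainly confusing; it does so because it needs to be able to return the special value EOF, which doesn't fit in a char. It therefore uses int, a "larger" type (able to represent more distinct values), so it can hold every char value and EOF.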
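To make the int-versus-char point concrete, here is a minimal sketch of reading one line character by character. The helper name read_line and its behaviour of silently dropping characters that don't fit are my own assumptions for illustration, not something from the question. The value returned by getc() is held in an int so EOF can be detected, and only then stored into a char buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one line from `in` into `buf` (capacity `size`),
 * dropping the newline. Characters that do not fit are read and discarded.
 * Returns the number of characters stored. */
size_t read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    size_t len = 0;
    int c;  /* int, not char, so that EOF stays distinguishable */

    while ((c = getc(in)) != EOF && c != '\n') {
        if (len < size - 1) {       /* keep room for the terminator */
            buf[len++] = (char)c;
        }
    }
    buf[len] = '\0';                /* always produce a valid C string */
    return len;
}
```

Calling read_line(stdin, stringArray, sizeof stringArray) with a char stringArray[MAX] would replace the getchar() loop in the question, and it also stops at end-of-file instead of looping forever.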

With a char array, you can use C's string functions directly:

char stringArray[MAX];

if(fgets(stringArray, sizeof stringArray, stdin) != NULL)
   printf("You entered %s", stringArray);

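Note that fgets() leaves the end-of-line character in the string, so you may want to strip it. I suggest implementing an in-place function that trims leading and trailing whitespace; it's a good exercise as well.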
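One possible shape for such a trimming function is sketched below. The name trim_in_place is made up for this example, and it treats everything isspace() reports as whitespace, which is an assumption about what "whitespace" should mean here.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: trims leading and trailing whitespace in place
 * and returns `s` for convenience. */
char *trim_in_place(char *s)
{
    assert(s != NULL);

    /* Skip leading whitespace. */
    char *start = s;
    while (isspace((unsigned char)*start)) {
        ++start;
    }

    /* Measure the remaining text, ignoring trailing whitespace. */
    size_t len = strlen(start);
    while (len > 0 && isspace((unsigned char)start[len - 1])) {
        --len;
    }

    /* Shift the kept text to the front (regions may overlap) and terminate. */
    memmove(s, start, len);
    s[len] = '\0';
    return s;
}
```

After fgets(stringArray, sizeof stringArray, stdin) succeeds, trim_in_place(stringArray) removes the trailing newline along with any stray spaces. The cast to unsigned char matters because passing a negative char value to isspace() is undefined behaviour.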

Answer 1 (score: 3)

    for (counter = 0; counter == MAX; counter++) {
        wordArray[counter] = '\0';
    } 

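You will never enter this loop: counter == MAX is false when counter is 0, so the body never runs. The condition should be counter < MAX.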
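A small sketch of the loop with a working condition, wrapped in a helper so the effect is easy to check. The function clear_buffer and its return value are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: zeroes every element of `buf` and returns the number
 * of elements written, which makes the loop's behaviour easy to verify. */
size_t clear_buffer(char *buf, size_t size)
{
    assert(buf != NULL);

    size_t written = 0;
    /* `counter < size` holds for 0 .. size-1, so the body runs `size` times.
     * With `counter == size` the test fails on the first check (for any
     * non-zero size) and the body runs zero times. */
    for (size_t counter = 0; counter < size; counter++) {
        buf[counter] = '\0';
        written++;
    }
    return written;
}
```

The standard library call memset(buf, 0, size) does the same job in one line. In the question's program the clearing is arguably unnecessary anyway, since each word is terminated with '\0' right after it is copied.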

Answer 2 (score: 2)

user1799795,

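For what it's worth (now that you've solved your problem), I took the liberty of showing you how I would do this given the "use arrays" restriction, and of explaining a bit about why I'd do it that way... Beware that while I'm an experienced programmer, I'm no C guru... I've worked with guys who absolutely blew me into the C-weeds (pun intended).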

#include <stdio.h>
#include <string.h>

#define LINE_SIZE 500
#define MAX_WORDS 50
#define WORD_SIZE 20

// Main function.
int main(int argc, const char * argv[])
{
    int counter = 0;

    // ----------------------------------
    // Read a line of input from the user (ie stdin)
    // ----------------------------------
    char line[LINE_SIZE];
    printf("Please type in a list of names then hit ENTER:\n");
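    if ( fgets(line, LINE_SIZE, stdin) == NULL ) {
        // fgets returns NULL on end-of-file or on a read error, so simply
        // retrying in a loop could spin forever.
        fprintf(stderr, "You must enter something. Pretty please!\n");
        return 1;
    }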

    // A note on that LINE_SIZE parameter to the fgets function:
    // wherever possible it's a good idea to use the version of the standard
    // library function that allows you to specify the maximum length of the
    // string (or indeed any array) because that dramatically reduces the
    // incidence of "string overruns", which are a major source of bugs in C
    // programs.
    // Also note that fgets includes the end-of-line character in
    // the returned string, so you have to ensure there's room for it in the
    // destination string, and remember to handle it in your string processing.

    // -------------------------
    // split the line into words
    // -------------------------

    // the current word
    char word[WORD_SIZE];
    int wordLength = 0;

    // the list of words
    char words[MAX_WORDS][WORD_SIZE]; // an array of up to 50 words of
                                      // up to 19 characters each (plus '\0')
    int wordCount = 0;                // the number of words in the array.


    // The below loop syntax is a bit cryptic.
    // The "char *c=line;" initialises the char-pointer "c" to the start of "line".
    // The " *c;" is ultra-shorthand for: "is the-char-at-c not equal to zero".
    //   All strings in C end with a "null terminator" character, which has the
    //   integer value of zero, and is commonly expressed as '\0' or 0 (not to
    //   be confused with NULL, which is the null *pointer* constant).
    //   In the C language any integer may be evaluated as a
    //   boolean (true|false) expression, where 0 is false, and (pretty obviously)
    //   everything-else is true. So: If the character at the address-c is not
    //   zero (the null terminator) then go-round the loop again. Capiche?
    // The "++c" moves the char-pointer to the next character in the line. I use
    // the pre-increment "++c" in preference to the more common post-increment
    // "c++" out of habit; for a plain pointer like this, a modern compiler
    // generates the same code for both.
    //
    // Note that this syntax is commonly used by "low level programmers" to loop
    // through strings. There is an alternative which is less cryptic and is
    // therefore preferred by most programmers, even though it's not quite as
    // efficient. In this case the loop would be:
    //    int lineLength = strlen(line);
    //    for ( int i=0; i<lineLength; ++i)
    // and then to get the current character
    //        char ch = line[i];
    // We get the length of the line once, because the strlen function has to
    // loop through the characters in the array looking for the null-terminator
    // character at its end (guess what its implementation looks like ;-)...
    // which is inherently an "expensive" operation (totally dependent on the
    // length of the string) so we at least avoid repeating this operation.
    //
    // I know I might sound like I'm banging on about not-very-much but once you
    // start dealing with "real world" magnitude datasets then such habits,
    // formed early on, pay huge dividends in the ability to write performant
    // code the first time round. Premature optimisation is evil, but my code
    // hardly ever NEEDS optimising, because it was "fairly efficient"
    // to start with. Yeah?

    for ( char *c=line; *c; ++c ) {    // foreach char in line.

        char ch = *c;  // "ch" is the character value-at the-char-pointer "c".

        if ( ch==' '               // if this char is a space,
          || ch=='\n'              // or we've reached the EOL char
        ) {
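            // 1. add the word to the end of the words list.
            //    note that we copy only wordLength characters, because "word"
            //    has no null-terminator (so the more usual strcpy would not
            //    work), and then terminate the copy ourselves so that
            //    printf("%s") can print it later. Empty words (e.g. from two
            //    spaces in a row) and words beyond MAX_WORDS are skipped.
            if (wordLength > 0 && wordCount < MAX_WORDS) {
                strncpy(words[wordCount], word, wordLength);
                words[wordCount++][wordLength] = '\0';
            }
            // 2. and "clear" the word buffer.
            wordLength=0;
        } else if (wordLength==WORD_SIZE-1) { // this word is too long
            // so split this word into two words.
            if (wordCount < MAX_WORDS) {
                strncpy(words[wordCount], word, wordLength);
                words[wordCount++][wordLength] = '\0';
            }
            wordLength=0;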
            word[wordLength++] = ch;
        } else {
            // otherwise: append this character to the end of the word.
            word[wordLength++] = ch;
        }
    }

    // -------------------------
    // print out the words
    // -------------------------

    for ( int w=0; w<wordCount; ++w ) {
        printf("Hi %s!\n", words[w]);
    }
    return 0;
}

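In the real world one can't make such restrictive assumptions about the maximum length of words, or about how many of them there will be, and if such limits are given they are almost always arbitrary and therefore proven wrong all too soon... so straight off the bat, for this problem I'd be inclined to use a linked list instead of the "words" array... wait till you get to "dynamic data structures"... you'll love 'em ;-)

Cheers. Keith.

PS: You're going pretty well... My advice is "just keep on truckin'"... this gets a lot easier with practice.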
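For the curious, a linked-list version might look roughly like the sketch below. Everything in it (struct word_node, word_list_append, word_list_free, split_words) is invented for this illustration, and it assumes words are separated only by spaces and newlines. Each word is heap-allocated at exactly the size it needs, so neither the word length nor the word count has a fixed limit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one heap-allocated word per node. */
struct word_node {
    char *text;
    struct word_node *next;
};

/* Frees every node in the list along with its text. */
void word_list_free(struct word_node *head)
{
    while (head != NULL) {
        struct word_node *next = head->next;
        free(head->text);
        free(head);
        head = next;
    }
}

/* Creates a node holding a copy of the first `len` chars of `text` and links
 * it after `tail` (if any). Returns the new node, or NULL if malloc fails. */
struct word_node *word_list_append(struct word_node *tail,
                                   const char *text, size_t len)
{
    assert(text != NULL);

    struct word_node *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->text = malloc(len + 1);
    if (node->text == NULL) {
        free(node);
        return NULL;
    }
    memcpy(node->text, text, len);
    node->text[len] = '\0';
    node->next = NULL;
    if (tail != NULL) {
        tail->next = node;
    }
    return node;
}

/* Splits `line` on spaces and newlines. Returns the head of the list, or
 * NULL if there are no words or an allocation fails. */
struct word_node *split_words(const char *line)
{
    assert(line != NULL);

    struct word_node *head = NULL;
    struct word_node *tail = NULL;
    const char *p = line;

    while (*p != '\0') {
        p += strspn(p, " \n");              /* skip separators */
        size_t len = strcspn(p, " \n");     /* length of the next word */
        if (len == 0) {
            break;                          /* only separators were left */
        }
        struct word_node *node = word_list_append(tail, p, len);
        if (node == NULL) {
            word_list_free(head);
            return NULL;
        }
        if (head == NULL) {
            head = node;
        }
        tail = node;
        p += len;
    }
    return head;
}
```

Printing then becomes a walk over the list, for example for (struct word_node *n = head; n != NULL; n = n->next) printf("Hi %s!\n", n->text); followed by word_list_free(head).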