Question

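Edit: I did try changing the line arr_of_strings[arr_index_count] = first_word; to strcpy(arr_of_strings[arr_index_count], first_word);, but then a segmentation fault occurs after printing Word is: This.

Edit 2: I am trying to do this without strtok because I think it is a good way to learn about C strings.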

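I am trying to learn C on my own. I decided to write a function that takes a string and puts each word of that string into an element of an array. Here is my code:

Assume #define MAX_LENGTH 80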

// char *string_one[unknown_size];

// first_word will represent each word in the sentence
char first_word[MAX_LENGTH + 1] = "";

// this is the array I will store each word in
char *arr_of_strings[MAX_LENGTH];

int index_count = 0;
int arr_index_count = 0;

char sentence[] = "This is a sentence.";

for (int i = 0; i<MAX_LENGTH; i++) {
    printf("Dealing with char: %c\n", sentence[i]); 

    if (sentence[i] == '\0') {
        // end of sentence
        break;
    } else if (sentence[i] ==  ' ') {
        // this signifies the end of a word
        printf("Word is: %s\n", first_word);
        arr_of_strings[arr_index_count] = first_word;
        // after putting the word in the string, make the word empty again
        strcpy(first_word, "");
        // verify that it is empty
        printf("First word is now: %s\n", first_word);

        index_count = 0;
        arr_index_count++;
    } else {
        // not the start of a new string... so keep appending the letter to first_word
        printf("Letter to put in first_word is: %c\n", sentence[i]);
        first_word[index_count] = sentence[i];
        index_count++;
    }
}

printf("-----------------\n");
for (int j = 0; j<=arr_index_count; j++) {
    printf("%s\n", arr_of_strings[j]);
}

This prints:

Dealing with char: T
Letter to put in first_word is: T
Dealing with char: h
Letter to put in first_word is: h
Dealing with char: i
Letter to put in first_word is: i
Dealing with char: s
Letter to put in first_word is: s
Dealing with char:  
Word is: This
First word is now: 
Dealing with char: i
Letter to put in first_word is: i
Dealing with char: s
Letter to put in first_word is: s
Dealing with char:  
Word is: isis
First word is now: 
Dealing with char: a
Letter to put in first_word is: a
Dealing with char:  
Word is: asis
First word is now: 
Dealing with char: s
Letter to put in first_word is: s
Dealing with char: e
Letter to put in first_word is: e
Dealing with char: n
Letter to put in first_word is: n
Dealing with char: t
Letter to put in first_word is: t
Dealing with char: e
Letter to put in first_word is: e
Dealing with char: n
Letter to put in first_word is: n
Dealing with char: c
Letter to put in first_word is: c
Dealing with char: e
Letter to put in first_word is: e
Dealing with char: .
Letter to put in first_word is: .
Dealing with char: 
-----------------
sentence.
sentence.
sentence.

If we look here:

First word is now: 
Dealing with char: i
Letter to put in first_word is: i
Dealing with char: s
Letter to put in first_word is: s
Dealing with char:  
Word is: isis

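Why is it that, when the word is empty and we put i and s into it, the word is now isis? (The same goes for asis.)
Why is the word sentence. printed 3 times? My algorithm is clearly flawed, but if anything, shouldn't the word be printed 4 times (once for each word in the sentence: This is a sentence.)?

Also, I am just learning C, so if there are other ways to improve the algorithm, please let me know.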

Answer 1

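arr_of_strings is just an array of char pointers, and you make every one of those pointers point to the same array, first_word. In addition, you do not use the null terminator that C strings require.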

Here is an approach that might help you; it uses strtok:

#include <string.h>
#include <stdio.h>

#define N 100
#define LEN 20 // max length of a word

int fill(char matrix[N][LEN], char* data)
{
    // How many words in 'data'?
    int counter = 0;
    char * pch;
    // Splits 'data' to tokens, separated by a whitespace
    pch = strtok (data," ");
    while (pch != NULL)
    {
        // Copy a word to the correct row of 'matrix'
        strcpy(matrix[counter++], pch);
        //printf ("%s\n",pch);
        pch = strtok (NULL, " ");
    }
    return counter;
}

void print(char matrix[N][LEN], int words_no)
{
   for(int i = 0; i < words_no; ++i)
       printf("%s\n", matrix[i]);
}

int main(void)
{
    char data[] = "New to the C programming language";
    // We will store each word of 'data' to a matrix, of 'N' rows and 'LEN' columns
    char matrix[N][LEN] = {0};
    int words_no;
    // 'fill()' populates 'matrix' with 'data' and returns the number of words contained in 'data'.
    words_no = fill(matrix, data);
    print(matrix, words_no);
    return 0;
}

Output:

New
to
the
C
programming
language

Answer 2

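1) This happens because you do not add a '\0' to the end of the word before printing it. After your program encounters the first space, first_word looks like {'T', 'h', 'i', 's', '\0', '\0', ...}, which prints just fine. Calling strcpy(first_word, "") changes this to {'\0', 'h', 'i', 's', '\0', ...}, and reading the next word, "is", then overwrites the first two characters, giving {'i', 's', 'i', 's', '\0', ...}, so first_word is now the string "isis", as shown in the output. This can be fixed by simply adding first_word[index_count] = '\0' before printing the string.

2.1) The array contains the same string at every index because your string array arr_of_strings is an array of string pointers that all end up pointing to the same string, first_word, which holds the last word of the sentence when the loop ends. This can be solved in several ways. One of them is to make arr_of_strings a two-dimensional array such as char arr_of_strings[MAX_STRINGS][MAX_LENGTH] and then add each string to that array with strcpy(arr_of_strings[arr_index_count], first_word).

2.2) Finally, it prints "sentence." only three times because you only check for a space to mark the end of a word. "sentence." ends with the null terminator '\0', so it is never added to the array of words, and the output never contains the line "Word is: sentence.".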

Answer 3

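"I am trying to do this without strtok because I think it is a good way to learn about C strings."

Yes, that's the spirit!

I have already explained some of the problems with your code in my previous answer, so now I will post a strtok-free solution, which should help you understand what is going on with the strings. It uses basic pointer arithmetic.

Pro tip: take a piece of paper, draw the arrays (data and matrix), keep a close eye on the values of their counters, and run the program on that paper.

Code: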

#include <string.h>
#include <stdio.h>

#define N 100
#define LEN 20 // max length of a word

int fill(char matrix[N][LEN], char* data)
{
    // How many words in 'data'?
    int counter = 0;
    // Array to store current word
    char word[LEN];
    // Counter 'i' for 'word'
    int i;
    // While there is still something to read from 'data'
    while(*data != '\0')
    {
        // We seek a new word
        i = 0;
        // While the current character of 'data' is not a whitespace or a null-terminator
        while(*data != ' ' && *data != '\0')
            // copy that character to word, and increment 'i'. Move to the next character of 'data'.
            word[i++] = *data++;
        // Null-terminate 'word'. 'i' is already at the value we desire, from the line above.
        word[i] = '\0';
        // If the current character of 'data' is not a null-terminator (thus it's a whitespace)
        if(*data != '\0')
            // Increment the pointer, so that we skip the whitespace (and be ready to read the next word)
            data++;
        // Copy the word to the counter-th row of the matrix, and increment the counter
        strcpy(matrix[counter++], word);
    }

    return counter;
}

void print(char matrix[N][LEN], int words_no)
{
   for(int i = 0; i < words_no; ++i)
       printf("%s\n", matrix[i]);
}

int main(void)
{
    char data[] = "Alexander the Great";
    // We will store each word of 'data' to a matrix, of 'N' rows and 'LEN' columns
    char matrix[N][LEN] = {0};
    int words_no;
    // 'fill()' populates 'matrix' with 'data' and returns the number of words contained in 'data'.
    words_no = fill(matrix, data);
    print(matrix, words_no);
    return 0;
}

Output:

Alexander
the
Great

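The gist of the code lies in the function fill(), which takes data and:

1) Finds a word.
2) Stores that word, character by character, in an array called word.
3) Copies that word to matrix.

The tricky part is finding the word. You need to iterate over the string and stop when you encounter a whitespace, which tells us that every character read during that iteration was a letter of one word.

However, you need to be careful when searching for the last word of the string, because at that point you will not encounter a whitespace. For that reason, you should also watch for the end of the string; in other words, the null terminator.

When you reach it, copy the last word into the matrix and you are done, but make sure you update the pointers correctly (this is where the paper idea I gave you will help).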

Answer 4

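Building on my strtok-free answer, I wrote some code that uses an array of N char pointers instead of a hard-coded 2D matrix.

char matrix[N][LEN] is a 2D array that can store up to N strings, each taking at most LEN bytes (LEN - 1 characters plus the null terminator). char *ptr_arr[N] is an array of N char pointers. It can also store up to N strings, but the length of each string is not fixed.

The pointer approach saves some space by allocating exactly as much memory as each string needs. With a hard-coded 2D array, you use the same amount of memory for every string. So if you assume a string can have length 20, you reserve a block of size 20 no matter which string you store, even though its size may be smaller than 20 or, even worse, larger. In the latter case you need to truncate the string or, if the code is not written carefully, you invoke Undefined Behavior by writing beyond the bounds of the array that stores the string.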

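With the pointer approach we do not need to worry about that and can allocate exactly the space each string needs, but, as always, there is a trade-off. We save some space, but we need to allocate memory dynamically (and deallocate it when we are done with it; C has no garbage collector, unlike Java, for example). Dynamic allocation is a powerful tool, but it costs more development time.

So, in my example, we will follow the same logic as before (regarding how we find the words in the string, and so on), but we will be careful about how we store the words in the matrix.

Once a word has been found and stored in the temporary array word, we can find its exact length with strlen(). We then dynamically allocate exactly that much space, plus 1 for the null terminator, which every C string should have (since <string.h> depends on it to find the end of a string).

So, to store the first word, "Alexander", we need to do: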

ptr_arr[0] = malloc(sizeof(char) * (9 + 1));

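where 9 is the result of strlen("Alexander"). Notice that we ask for a memory block whose size is the size of a char times 10. The size of a char is 1, so it makes no difference here, but in general you should keep the multiplication (since you may later want other data types, or even structs).

We make the first pointer of the array point to the memory block we have just allocated. That block now belongs to us, so we are allowed to store data in it (in our case, the word). We do that with strcpy().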

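After that we go on to print the words.

At this point it may look as if the program is finished. BUT, since we allocated memory dynamically, we need to free() it! Forgetting to free the memory they asked for is a common mistake people make.

We do that by freeing every pointer that points to memory returned by malloc(). So if we called malloc() 10 times, free() should be called 10 times as well; otherwise a memory leak occurs!

Enough talking, here is the code: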

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

#define N 100

int fill(char* ptr_arr[N], char* data)
{
    // How many words in 'data'?
    int counter = 0;
    // Array to store current word, assuming max length will be 50
    char word[50];
    // Counter 'i' for 'word'
    int i;
    // While there is still something to read from 'data'
    while(*data != '\0')
    {
        // We seek a new word
        i = 0;
        // While the current character of 'data' is not a whitespace or a null-terminator
        while(*data != ' ' && *data != '\0')
            // copy that character to word, and increment 'i'. Move to the next character of 'data'.
            word[i++] = *data++;
        // Null-terminate 'word'. 'i' is already at the value we desire, from the line above.
        word[i] = '\0';
        // If the current character of 'data' is not a null-terminator (thus it's a whitespace)
        if(*data != '\0')
            // Increment the pointer, so that we skip the whitespace (and be ready to read the next word)
            data++;
        // Dynamically allocate space for a word of length `strlen(word)`
        // plus 1 for the null terminator. Assign that memory chunk to the
        // pointer positioned at `ptr_arr[counter]`.
        ptr_arr[counter] = malloc(sizeof(char) * (strlen(word) + 1));
        // Now, `ptr_arr[counter]` points to a memory block, that will
        // store the current word.

        // Copy the word to the counter-th row of the ptr_arr, and increment the counter
        strcpy(ptr_arr[counter++], word);
    }

    return counter;
}

void print(char* matrix[N], int words_no)
{
   for(int i = 0; i < words_no; ++i)
       printf("%s\n", matrix[i]);
}

void free_matrix(char* matrix[N], int words_no)
{
   for(int i = 0; i < words_no; ++i)
       free(matrix[i]);
}

int main(void)
{
    char data[] = "Alexander the Great";
    // Each word of 'data' gets its own heap block, pointed to by one of the 'N' pointers of 'matrix'
    char *matrix[N];
    int words_no;
    // 'fill()' populates 'matrix' with 'data' and returns the number of words contained in 'data'.
    words_no = fill(matrix, data);
    print(matrix, words_no);
    free_matrix(matrix, words_no);
    return 0;
}

Output:

Alexander
the
Great

Strings not being emptied and assigned properly when dealing with strcpy(string, "")
