How do I count the number of words in a text string variable and paste it as a column in a table?

Time: 2020-04-28 15:42:00

Tags: r

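I have a table in RStudio, and in the third column I need to paste the number of words found in the second column, using for and while loops. I don't know how to do this. Can anyone help?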

1 answer:

Answer 0: (score: 0)

Give this a try...

I have created a sample data frame (although it always helps if you include a reproducible example in the question).

    some_nums <- c("2","100","16","999", "65")
    the_words <- c("some words", "these are some more words", "and these are even more words too", "now fewer words", "I do not even want to try and count the number of words here so why not just let our code figure it out")

    my_df <- data.frame(some_nums, the_words, stringsAsFactors = FALSE)

The data frame looks like this:

      some_nums                                                                                               the_words
    1         2                                                                                              some words
    2       100                                                                               these are some more words
    3        16                                                                       and these are even more words too
    4       999                                                                                         now fewer words
    5        65 I do not even want to try and count the number of words here so why not just let our code figure it out

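Now we just need to count the words by applying a string-splitting function to each string in the relevant column of the data frame. The split can use the space between words, or whatever else separates each word, as the delimiter. We can also insert the resulting counts into a new column in the same step with the following code.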

    my_df[["number_of_words"]] <- sapply(strsplit(my_df$the_words, " "), length)

This gives us the following output:

      some_nums                                                                                               the_words number_of_words
    1         2                                                                                              some words               2
    2       100                                                                               these are some more words               5
    3        16                                                                       and these are even more words too               7
    4       999                                                                                         now fewer words               3
    5        65 I do not even want to try and count the number of words here so why not just let our code figure it out              24
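
Since the question asks specifically for for and while loops, the same count can also be written out explicitly. The following is only a sketch of one way to do it: `loop_df` and `counts_while` are names made up for this illustration, and the data is a shortened copy of the sample above so the block runs on its own.

```r
# Shortened copy of the sample data (illustrative only)
the_words <- c("some words", "these are some more words", "now fewer words")
loop_df <- data.frame(the_words, stringsAsFactors = FALSE)

# for loop: pre-allocate the new column, then fill it one row at a time
loop_df$number_of_words <- integer(nrow(loop_df))
for (i in seq_len(nrow(loop_df))) {
  # strsplit() returns a list, so take its first element to get the words
  pieces <- strsplit(loop_df$the_words[i], " ")[[1]]
  loop_df$number_of_words[i] <- length(pieces)
}

# while loop: same logic, with the row index advanced by hand
counts_while <- integer(nrow(loop_df))
i <- 1
while (i <= nrow(loop_df)) {
  counts_while[i] <- length(strsplit(loop_df$the_words[i], " ")[[1]])
  i <- i + 1
}

print(loop_df)
print(counts_while)
```

Both loops should produce the same counts as the `sapply()` one-liner. The vectorised version is usually the more idiomatic choice in R, but the loops make each step visible.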

I hope this helps.
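
One caveat: splitting on a single space over-counts when a string has leading spaces or repeated spaces, because `strsplit()` then returns empty strings as well. If your real data might look like that, a possible variation is to trim each string and split on runs of whitespace. This is a sketch on made-up input; `messy` and `word_counts` are hypothetical names.

```r
# Made-up strings with extra spaces, a tab, and an empty entry
messy <- c("  some   words ", "now\tfewer words", "")

# trimws() removes leading/trailing whitespace;
# "\\s+" treats any run of whitespace as one delimiter
tokens <- strsplit(trimws(messy), "\\s+")

# lengths() returns the length of each list element
word_counts <- lengths(tokens)
print(word_counts)  # 2 3 0
```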