Question

I am trying to find the words with more than 4 letters in a text sentence. I tried this:

void scale(Segment& segment, SegmentEnd end, const double& scaleVal)
{
    Point& p(segment.*end);
    p._x = scaleVal*p._x;
    p._y = scaleVal*p._y;
}

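As a result I would like to get, for the example above, that the first sentence/string has 2 words longer than 4 letters and that the second one also has 2.

Using nchar I only get the full length of the string in characters.

What is the right way to do this?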

Answer 1

library(dplyr)
library(purrr)

# vector of sentences
fullsetence <- as.character(c("A test setence with test length","A second test for length"))

# get vector of counts for words with more than 4 letters
fullsetence %>%
  strsplit(" ") %>%
  map(~sum(nchar(.) > 4)) %>%
  unlist()

# [1] 2 2


# create a dataframe with sentence and the corresponding counts
# use previous code as a function within "mutate" 
data.frame(fullsetence, stringsAsFactors = FALSE) %>%
  mutate(Counts = fullsetence %>%
                   strsplit(" ") %>%
                   map(~sum(nchar(.) > 4)) %>%
                   unlist() )

#                       fullsetence Counts
# 1 A test setence with test length      2
# 2        A second test for length      2
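
For comparison, here is a minimal base R sketch of the same counting step without dplyr or purrr. The helper name `count_long_words` and its `min_len` argument are made up for illustration and are not part of the original answer. It assumes words are separated by single spaces, as in the example sentences.

```r
# Hypothetical helper (not from the original answer): for each sentence,
# count the words that have more than `min_len` characters.
count_long_words <- function(x, min_len = 4) {
  # Split each sentence on literal single spaces
  words <- strsplit(x, " ", fixed = TRUE)
  # sum() over a logical vector counts the TRUE values
  vapply(words, function(w) sum(nchar(w) > min_len), integer(1))
}

fullsetence <- c("A test setence with test length", "A second test for length")
count_long_words(fullsetence)
# [1] 2 2
```

`vapply` with `integer(1)` always returns a plain integer vector, so no `unlist()` is needed afterwards.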

If you want to get the actual words with more than 4 letters, you can use it in a similar way:

fullsetence %>%
  strsplit(" ") %>%
  map(~ .[nchar(.) > 4])

data.frame(fullsetence, stringsAsFactors = FALSE) %>%
  mutate(Words = fullsetence %>%
                 strsplit(" ") %>%
                 map(~ .[nchar(.) > 4]))
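
Note that the filtering step returns a list with one character vector per sentence, so `Words` ends up as a list column. If a plain character column is easier to work with, one option is to paste each vector into a single string. This is a base R sketch under that assumption; the names `long_words` and `words_chr` are illustrative only.

```r
fullsetence <- c("A test setence with test length", "A second test for length")

# One character vector of long words per sentence (a list of length 2)
long_words <- lapply(strsplit(fullsetence, " ", fixed = TRUE),
                     function(w) w[nchar(w) > 4])

# Collapse each vector into a single comma-separated string
words_chr <- vapply(long_words, paste, character(1), collapse = ", ")
words_chr
# [1] "setence, length" "second, length"

# Ordinary data frame with a character column instead of a list column
data.frame(fullsetence, Words = words_chr, stringsAsFactors = FALSE)
```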
