I am trying to find the words with more than 4 letters in a text sentence. I tried this:
void scale(Segment& segment, SegmentEnd end, const double& scaleVal)
{
Point& p(segment.*end);
p._x = scaleVal*p._x;
p._y = scaleVal*p._y;
}
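I would like the result to be a count per sentence. In the previous example, the first sentence/string has 2 words longer than 4 letters, and the second one also has 2.
Using nchar I only get the total number of characters in the string.
What is the right way to do this?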
Answer 0 (score: 1)
library(dplyr)
library(purrr)
# vector of sentences
fullsetence <- as.character(c("A test setence with test length", "A second test for length"))
# get vector of counts for words with more than 4 letters
fullsetence %>%
  strsplit(" ") %>%
  map(~ sum(nchar(.) > 4)) %>%
  unlist()
# [1] 2 2
# create a dataframe with sentence and the corresponding counts
# use previous code as a function within "mutate"
data.frame(fullsetence, stringsAsFactors = FALSE) %>%
  mutate(Counts = fullsetence %>%
           strsplit(" ") %>%
           map(~ sum(nchar(.) > 4)) %>%
           unlist())
#                       fullsetence Counts
# 1 A test setence with test length      2
# 2        A second test for length      2
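If you would rather not depend on dplyr and purrr, the same count can be sketched in base R. The helper name count_long_words and its min_chars argument are made up for illustration and are not part of the answer above; the sketch also assumes words are separated by single spaces, as in the example sentences.

```r
# Hypothetical helper (name and arguments are illustrative assumptions).
count_long_words <- function(sentences, min_chars = 4) {
  # strsplit() returns a list with one character vector of words per sentence
  words <- strsplit(sentences, " ", fixed = TRUE)
  # for each sentence, count the words strictly longer than min_chars
  vapply(words, function(w) sum(nchar(w) > min_chars), integer(1))
}

fullsetence <- c("A test setence with test length", "A second test for length")
counts <- count_long_words(fullsetence)
print(counts)
# [1] 2 2
```

vapply is used instead of sapply so that the result is always an integer vector, even when the input is empty.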
If you want the actual words with more than 4 letters, you can use a similar approach:
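# list with one character vector of long words per sentence
fullsetence %>%
  strsplit(" ") %>%
  map(~ .[nchar(.) > 4])
# same idea inside "mutate"; Words becomes a list column
data.frame(fullsetence, stringsAsFactors = FALSE) %>%
  mutate(Words = fullsetence %>%
           strsplit(" ") %>%
           map(~ .[nchar(.) > 4]))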
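A list column can be awkward to print or export, so one option is to collapse each sentence's long words into a single string. The following is a minimal base R sketch under that assumption; the variable names long_words and words_collapsed and the ", " separator are illustrative choices, not something the answer prescribes.

```r
fullsetence <- c("A test setence with test length", "A second test for length")

# keep only the words strictly longer than 4 characters, per sentence
long_words <- lapply(strsplit(fullsetence, " ", fixed = TRUE),
                     function(w) w[nchar(w) > 4])

# collapse each sentence's words into one comma-separated string
words_collapsed <- vapply(long_words, paste, character(1), collapse = ", ")
print(words_collapsed)
# [1] "setence, length" "second, length"
```

A sentence with no long words yields an empty string here, because paste with collapse returns "" for zero-length input.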