How to extract the first sentence with 8 or fewer words from a string?

Time: 2016-11-11 15:48:04

Tags: r

How can I extract the first (or last) sentence that has 8 or fewer words (or meets some other condition)? For example, I have the text

text <- "The quick brown fox. This is wonderful!"

What is the most elegant way to extract the first/last sentence of this text based on its number of words?

2 answers:

Answer 0: (score: 1)

If we want to find the first sentence with fewer than 4 words, something like this works, though I don't think it is the most elegant way:

text <- "The quick brown fox. This is wonderful!"
split <- unlist(strsplit(text, "\\. "))
number_words <- sapply(split, function(x) length(unlist(strsplit(x, " "))))
split[which(number_words < 4)[1]]
# [1] "This is wonderful!"
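
One caveat with counting words via strsplit(x, " "): consecutive spaces produce empty strings, which inflate the count. Below is a minimal sketch of a more forgiving counter, assuming words are separated by any run of whitespace. The name count_words is a hypothetical helper, not part of the answer above.

```r
# Hypothetical helper: count whitespace-separated words in each string.
count_words <- function(x) {
  # trimws() drops leading/trailing whitespace so it does not create
  # an empty first token; "\\s+" treats any run of whitespace as one gap.
  lengths(strsplit(trimws(x), "\\s+"))
}

# Naive split on a single space counts the empty strings between spaces.
naive <- lengths(strsplit("This  is   wonderful!", " "))
naive
# [1] 6

count_words(c("The quick brown fox", "This  is   wonderful!"))
# [1] 4 3
```

For input with single spaces only, as in the question, both approaches give the same counts.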

Answer 1: (score: 1)

In R, it takes just a couple of flicks of the wrist to organize the information. I added another sentence for complexity:

text <- "The quick brown fox. This is wonderful! A sentence with eight or more words in it?"
# Split after sentence-ending punctuation, dropping the optional space that follows
sentence <- strsplit(text, "(?<=[.?!]) ?", perl = TRUE)[[1]]
count <- lengths(strsplit(sentence, " "))
condition <- count < 8
# Keep sentences as character strings rather than factors
df <- data.frame(sentence, count, condition, stringsAsFactors = FALSE)
df
#                                     sentence count condition
# 1                       The quick brown fox.     4      TRUE
# 2                         This is wonderful!     3      TRUE
# 3 A sentence with eight or more words in it?     9     FALSE

# First
df$sentence[df$condition][1]
# [1] "The quick brown fox."

# Last
tail(df$sentence[df$condition], 1)
# [1] "This is wonderful!"
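
If this lookup is needed more than once, the same steps can be wrapped in a small function. The following is only a sketch: pick_sentence, max_words, and position are hypothetical names chosen for illustration. It uses <= to match the "8 or fewer" wording of the question, and it assumes sentences end in ".", "?" or "!".

```r
# Hypothetical helper (not from the answers): return the first or last
# sentence of `text` that has at most `max_words` words.
pick_sentence <- function(text, max_words = 8, position = c("first", "last")) {
  position <- match.arg(position)
  # Split after sentence-ending punctuation, as in the answer above.
  sentences <- strsplit(text, "(?<=[.?!]) ?", perl = TRUE)[[1]]
  counts <- lengths(strsplit(sentences, " "))
  hits <- sentences[counts <= max_words]
  # No sentence satisfies the condition: return a missing value.
  if (length(hits) == 0) return(NA_character_)
  if (position == "first") hits[1] else hits[length(hits)]
}

text <- "The quick brown fox. This is wonderful! A sentence with eight or more words in it?"
pick_sentence(text)
# [1] "The quick brown fox."
pick_sentence(text, position = "last")
# [1] "This is wonderful!"
pick_sentence(text, max_words = 2)
# [1] NA
```

Returning NA_character_ when nothing qualifies is a design choice; stopping with an error would be equally reasonable, depending on how the result is used.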