Question

I am loading a txt file:

subtitle <- readLines('subtitle.txt')

Now I want to go through the text sentence by sentence, for example:

first_sentence <- subtitle[1]

How can I do this in R?

Example text:

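I think that’s not true. I think it’s unfortunate. I think myself and everybody who works on these movies loves cinema, loves movies, loves going to the movies, loves to watch a communal experience in a movie theater full of people. And we’ve been very lucky that our movie theaters are often full of people when our movies play, and that’s a very special thing.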

Answer 1

Just as a hint in the right direction, strsplit may be helpful here:

x <- "I think that’s not true. I think it’s unfortunate. I think myself and everybody who works on these movies loves cinema, loves movies, loves going to the movies, loves to watch a communal experience in a movie theater full of people. And we’ve been very lucky that our movie theaters are often full of people when our movies play, and that’s a very special thing."
strsplit(x, "\\.\\s*")[[1]]

This outputs:

[1] "I think that’s not true"                                                                                                                                                             
[2] "I think it’s unfortunate"                                                                                                                                                            
[3] "I think myself and everybody who works on these movies loves cinema, loves movies, loves going to the movies, loves to watch a communal experience in a movie theater full of people"
[4] "And we’ve been very lucky that our movie theaters are often full of people when our movies play, and that’s a very special thing"
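
As a rough sketch of how this hint could be applied to the file from the question: readLines returns one element per line of the file, not one per sentence, so subtitle[1] is the first line. Joining the lines into a single string before splitting gives a vector of sentences that can be looped over. The temporary file and its contents below are made up so that the example is self-contained; in practice the path would be 'subtitle.txt'.

```r
# Hypothetical input file, written here only to make the sketch runnable.
path <- tempfile(fileext = ".txt")
writeLines(c("I think that's not true. I think it's",
             "unfortunate. And that's a very special thing."), path)

lines <- readLines(path)               # one element per line, not per sentence
text <- paste(lines, collapse = " ")   # join the lines into a single string

# Split on a period followed by optional whitespace (same pattern as above).
sentences <- strsplit(text, "\\.\\s*")[[1]]

# Iterate sentence by sentence.
for (s in sentences) {
  print(s)
}

first_sentence <- sentences[1]
```

Note that the second sentence spans two lines in the file, which is why the lines are pasted together before splitting.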

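This answer assumes that a period (.) always marks the end of a sentence. That is of course not true if a sentence contains abbreviations or initials, e.g. "J.J. Abrams makes good movies."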
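
One possible way to soften the problem with initials, offered only as a heuristic sketch: split on whitespace that follows sentence-ending punctuation, but not when the period comes directly after a single uppercase letter. The pattern below is an assumption of this example, not part of the original answer. It relies on Perl-style lookbehind, so perl = TRUE is required, and because the punctuation is matched by a lookbehind rather than consumed, each sentence keeps its final period.

```r
x <- "J.J. Abrams makes good movies. I think that is true."

# (?<=[.!?])    the whitespace must follow ., ! or ?
# (?<![A-Z]\\.) ...but not an uppercase letter plus period (an initial)
# \\s+          the whitespace itself is what gets removed
pattern <- "(?<=[.!?])(?<![A-Z]\\.)\\s+"

sentences <- strsplit(x, pattern, perl = TRUE)[[1]]
print(sentences)
# [1] "J.J. Abrams makes good movies." "I think that is true."
```

This is still fragile: an abbreviation such as "Dr. Smith" would be split after "Dr.", and a sentence that really ends in an uppercase letter would not be split at all. For real subtitle text a dedicated sentence tokenizer is likely to be more reliable.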
