Split each character in R

Time: 2014-04-22 00:49:43

Tags: r substring

I have a file, song.txt:

*****
[1]"The snow glows white on the mountain tonight
Not a footprint to be seen."
[2]"A kingdom of isolation,
and it looks like I'm the Queen"
[3]"The wind is howling like this swirling storm inside
Couldn't keep it in;
Heaven knows I've tried"
*****
[4]"Don't let them in,
don't let them see"
[5]"Be the good girl you always have to be
Conceal, don't feel,
don't let them know"
[6]"Well now they know"
*****

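I want to loop through the lyrics and fill in the elements of a list. Each element of the list should contain a character vector, where each element of the vector is one word from the song.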

[1] "The" "snow" "glows" "white" "on" "the" "mountain" "tonight" "Not" "a" "footprint"
    "to" "be" "seen." "A" "kingdom" "of" "isolation," "and" "it" "looks" "like" "I'm" "the"     
    "Queen" "The" "wind" "is" "howling" "like" "this" "swirling" "storm" "inside"
    "Couldn't" "keep" "it" "in" "Heaven" "knows" "I've" "tried"
[2] "Don't" "let" "them" "in," "don't" "let" "them" "see" "Be" "the" "good" "girl" "you"
   "always" "have" "to" "be" "Conceal," "don't" "feel," "don't" "let" "them" "know"
   "Well" "now" "they" "know"

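First, I created an empty list with words <- vector("list", 2).

I think I should first put the text into one long character vector, using the ***** delimiters to mark where each section starts and stops, with: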

star="\\*{5}"
pindex = grep(star, page)

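What should I do after this?

2 answers:

Answer 0: (score: 0)

It sounds like what you want is strsplit, effectively run twice. So, starting from "a single long string separated by ***** and spaces" (which I assume is what you have?):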

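#Split the single string into verses; [[1]] pulls the pieces out of the
#one-element list that strsplit returns, so lapply sees one verse at a time
verses <- strsplit(song, split = "\\*{5}")[[1]]

list_of_vectors <- lapply(verses, function(x) {

  #Split each verse by spaces
  split_verse <- strsplit(x, split = " ")

  #Then return it as a vector
  return(unlist(split_verse))

})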

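The result should be a list with one element per verse, each element being a vector of the words in that verse. If the object you read in is not a single string, show us the file and how you are reading it ;).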
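As a minimal sketch of the "strsplit twice" idea, assuming the lyrics are already in one string: the `song` value below is a shortened, made-up stand-in for the real file contents, and splitting on `\\s+` plus dropping empty pieces are assumptions added here to cope with the empty string that strsplit leaves before a leading delimiter.

```r
# Hypothetical stand-in for the file contents, collapsed into one string
song <- "***** The snow glows white Not a footprint ***** Don't let them in, don't let them see *****"

# First split: one piece per verse; [[1]] takes the pieces for our single string
verses <- strsplit(song, split = "\\*{5}")[[1]]

# Drop pieces that are empty or whitespace only (e.g. before the first delimiter)
verses <- verses[nzchar(trimws(verses))]

# Second split: break each verse into words on runs of whitespace
list_of_vectors <- lapply(verses, function(x) {
  words <- strsplit(trimws(x), split = "\\s+")[[1]]
  words[nzchar(words)]
})

print(list_of_vectors)
```

If the file was read with readLines, something like paste(readLines("song.txt"), collapse = " ") should give a single string of this shape first.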

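Answer 1: (score: 0)

To get it into the format you want, maybe give this a try. Also, please update your post with more information so that we can ultimately solve your problem. Several aspects of the question you posted need clarification. Hope this helps.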

## writeLines(text <- "*****
## The snow glows white on the mountain tonight
## Not a footprint to be seen.
## A kingdom of isolation,
## and it looks like I'm the Queen
## The wind is howling like this swirling storm inside
## Couldn't keep it in;
## Heaven knows I've tried
## *****
## Don't let them in,
## don't let them see
## Be the good girl you always have to be Conceal,
## don't feel,
## don't let them know
## Well now they know
## *****", "song.txt")

> read.song <- readLines("song.txt")
> split.song <- unlist(strsplit(read.song, "\\s"))
> star.index <- grep("\\*{5}", split.song)
> word.index <- lapply(2:length(star.index), function(i){
    (star.index[i-1]+1):(star.index[i]-1)
    })
> lapply(seq(word.index), function(i) split.song[ word.index[[i]] ])
## [[1]]
##  [1] "The"        "snow"       "glows"      "white"      "on"         "the"        "mountain"  
##  [8] "tonight"    "Not"        "a"          "footprint"  "to"         "be"         "seen."     
## [15] "A"          "kingdom"    "of"         "isolation," "and"        "it"         "looks"     
## [22] "like"       "I'm"        "the"        "Queen"      "The"        "wind"       "is"        
## [29] "howling"    "like"       "this"       "swirling"   "storm"      "inside"     "Couldn't"  
## [36] "keep"       "it"         "in;"        "Heaven"     "knows"      "I've"       "tried"     

## [[2]]
##  [1] "Don't"    "let"      "them"     "in,"      "don't"    "let"      "them"     "see"      "Be"      
## [10] "the"      "good"     "girl"     "you"      "always"   "have"     "to"       "be"       "Conceal,"
## [19] "don't"    "feel,"    "don't"    "let"      "them"     "know"     "Well"     "now"      "they"    
## [28] "know"
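
An alternative way to group the lines, sketched here under the assumption that every delimiter sits on its own line exactly as `*****`: instead of computing index ranges between the stars, label each line with a running section number via cumsum() and let split() do the grouping. This is not the approach used above, just another option; the sample text and the names `song_lines`, `is_star`, and `section` are made up for the illustration.

```r
# Write a shortened, hypothetical sample to a temporary file so the sketch is self-contained
path <- tempfile(fileext = ".txt")
writeLines(c("*****",
             "The snow glows white",
             "Not a footprint to be seen.",
             "*****",
             "Don't let them in,",
             "don't let them see",
             "*****"), path)

song_lines <- readLines(path)

# TRUE for delimiter lines made of exactly five asterisks
is_star <- grepl("^\\*{5}$", song_lines)

# Each delimiter starts a new section; cumsum() labels every line with its section number
section <- cumsum(is_star)

# Keep only the lyric lines, then group them by section
verses <- split(song_lines[!is_star], section[!is_star])

# Split every verse into words on whitespace, one flat vector per verse
words <- unname(lapply(verses, function(v) unlist(strsplit(v, "\\s+"))))

print(words)
```

Because the trailing delimiter has no lyric lines after it, it produces no extra empty element in the result.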