Combine the result of the function on a row in one column

Time: 2015-05-24 20:45:16

Tags: r data.table

I have a large data.table in which one column contains text. Here is a simple example:

library(data.table)
x = data.table(text = c("This is the first text", "Second text"))
x
                     text
1: This is the first text
2:            Second text

I would like to get a data.table with a single column containing every word from all the texts. Here is my attempt:

x[, strsplit(text, " ")]

Which results in:

      V1     V2
1:  This Second
2:    is   text
3:   the Second
4: first   text
5:  text Second

The result I would like to get is:

     text
1:   This
2:     is
3:    the
4:  first
5:   text
6: Second
7:   text

2 Answers:

Answer 0 (score: 3)

You are close. What you are looking for is:

data.table(text=unlist(strsplit(x$text, " ")))

#     text
#1:   This
#2:     is
#3:    the
#4:  first
#5:   text
#6: Second
#7:   text
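
As the output in the question shows, the original attempt produced two columns because strsplit returns a list with one element per input string, and each list element returned in j became its own column. A minimal sketch of doing the flattening inside j instead, assuming the same x as in the question; fixed = TRUE is an addition here so that the separator is treated as a literal space rather than a regular expression:

```r
library(data.table)

# Same example data as in the question
x <- data.table(text = c("This is the first text", "Second text"))

# strsplit() returns a list with one character vector per row;
# unlist() flattens it into a single vector, and .() wraps that
# vector as one named column
words <- x[, .(text = unlist(strsplit(text, " ", fixed = TRUE)))]
words
```

This keeps the whole operation inside the data.table call, which may be convenient when it is chained with further steps.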

Answer 1 (score: 2)

As @Henrik mentioned in the comments, you can use cSplit from the splitstackshape package for this task:

library(splitstackshape)
cSplit(x, "text", sep = " ", direction = "long")

This gives:

#     text
#1:   This
#2:     is
#3:    the
#4:  first
#5:   text
#6: Second
#7:   text

You can also create a column that identifies the original sentence in the result:

library(dplyr)  # provides %>% and n()
x %>% mutate(n = 1:n()) %>% cSplit(., "text", sep = " ", direction = "long")

This gives:

#     text n
#1:   This 1
#2:     is 1
#3:    the 1
#4:  first 1
#5:   text 1
#6: Second 2
#7:   text 2
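
The same sentence index can also be built without dplyr or splitstackshape. A minimal sketch, assuming the same x as in the question; the names words and res are illustrative only, and lengths() requires R 3.2.0 or later:

```r
library(data.table)

# Same example data as in the question
x <- data.table(text = c("This is the first text", "Second text"))

# One character vector of words per original row
words <- strsplit(x$text, " ", fixed = TRUE)

# Repeat each row index once per word found in that row
res <- data.table(
  text = unlist(words),
  n    = rep(seq_along(words), lengths(words))
)
res
```

Here rep() with a vector of counts repeats each row number as many times as that row has words, so every word stays paired with the sentence it came from.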