TextBlob: tokenizing words into an array

Time: 2015-07-23 02:52:38

Tags: arrays tokenize textblob

from textblob import TextBlob
import nltk

array = ("i have a bunch of grapes", "i like to eat apple", "this is a laptop")
array2 = []

for i in array:
    c = TextBlob(i)
    # c.words is a WordList, not a plain list
    array2.append(c.words)

print(array2)

The printed result is:

[WordList(['i', 'have', 'a', 'bunch', 'of', 'grapes']), WordList(['i', 'like', 'to', 'eat', 'apple']), WordList(['this', 'is', 'a', 'laptop'])]

How can I extract the words from each WordList so that array2 prints as:

[['i', 'have', 'a', 'bunch', 'of', 'grapes'], ['i', 'like', 'to', 'eat', 'apple'], ['this', 'is', 'a', 'laptop']]

2 answers:

Answer 0 (score: 1)

You can convert each WordList with list() before appending it to array2:

for i in array:
    c=TextBlob(i)
    array2.append(list(c.words))
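
This works because WordList is a subclass of list, so list() copies its items into a built-in list and the WordList(...) wrapper disappears from the printed output. A minimal sketch of that behaviour, runnable without textblob installed: WordListLike and to_plain_lists are hypothetical names made up for this illustration, and str.split() stands in for TextBlob(...).words, which is only an approximation that happens to fit these simple sentences.

```python
# Hypothetical stand-in for textblob's WordList: a list subclass with a
# custom repr. It is NOT the real class, only an illustration.
class WordListLike(list):
    def __repr__(self):
        return "WordList(%s)" % list.__repr__(self)


def to_plain_lists(word_lists):
    """Convert each list-like container into a built-in list."""
    return [list(words) for words in word_lists]


sentences = ("i have a bunch of grapes", "i like to eat apple", "this is a laptop")

# Assumption: str.split() approximates TextBlob(sentence).words here.
tokenized = [WordListLike(s.split()) for s in sentences]

array2 = to_plain_lists(tokenized)
print(array2)
```

With the real library, the same list comprehension can be written as [list(TextBlob(s).words) for s in array].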

Answer 1 (score: 0)

array2.append(c.words._collection)