How to enumerate the sentences of a paragraph one at a time with SpaCy

Date: 2018-08-24 04:51:05

Tags: python spacy

I would like to read a paragraph sentence by sentence with SpaCy. However, when I try to enumerate the sentences, I end up enumerating words rather than sentences. Indeed,

text = predicted.iloc[0,5]
sentences = spacy_nlp(text)
print(sentences)
for i,sent in enumerate(sentences):
    print("---",i,"---")
    print(sent)

this first gives the SpaCy sentences:

['Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress.', "Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child.", "Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time.", 'Their hiatus saw the release of Beyoncé\'s debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy".']

but then it enumerates the words instead of the sentences:

--- 0 ---
[
--- 1 ---
'
--- 2 ---
Beyoncé
--- 3 ---
Giselle
--- 4 ---
Knowles
--- 5 ---
-
--- 6 ---
Carter
...
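
(For reference, iterating over a spaCy Doc directly yields Token objects; sentences are exposed by the doc.sents generator, which requires a pipeline component that sets sentence boundaries, such as the parser. A minimal sketch, assuming the en_core_web_sm model is installed; the question's spacy_nlp could be any such pipeline:)

import spacy

# Assumption: en_core_web_sm is installed; any pipeline whose parser
# (or sentencizer) sets sentence boundaries would behave the same way.
spacy_nlp = spacy.load("en_core_web_sm")

doc = spacy_nlp("Beyoncé rose to fame in the late 1990s. She is an American singer.")

# Iterating the Doc itself yields Token objects (words and punctuation)...
print(type(next(iter(doc))))   # <class 'spacy.tokens.token.Token'>

# ...while doc.sents yields sentence Spans.
for i, sent in enumerate(doc.sents):
    print("---", i, "---")
    print(sent.text)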

PS:

Thanks to X for the idea of converting it into a list, which allowed me to write it out sentence by sentence.

However, the whole point is to then feed this into the nltk_spacy_tree() function, which seems to accept only objects of type spacy.tokens.doc.Doc, so I did the following, which does not feel right. It seems too complicated:

text = predicted.iloc[0,5]
sentences = list(spacy_nlp(text))
sentences = en_nlp(predicted["context"][0].lower()).sents
#print(type(en_nlp(sentences)))
for i,sent in enumerate(sentences):
    print("---",i,"---")
    print(en_nlp(str(sent)))
    sent = en_nlp(str(sent))
    tree = nltk_spacy_tree(sent)
    print(tree)
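
(A possibly simpler variant of the loop above, just a sketch: nltk_spacy_tree() is the asker's own helper, assumed here to accept any spacy.tokens.doc.Doc. Each sentence from doc.sents is a Span, and Span.as_doc() turns it into a standalone Doc, so the text does not have to be re-parsed with en_nlp(str(sent)):)

text = predicted.iloc[0, 5]
doc = en_nlp(text.lower())

for i, sent in enumerate(doc.sents):
    print("---", i, "---")
    print(sent.text)
    # Span.as_doc() returns a new Doc covering just this sentence,
    # matching the spacy.tokens.doc.Doc type nltk_spacy_tree() expects.
    tree = nltk_spacy_tree(sent.as_doc())
    print(tree)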

0 Answers:

There are no answers