Correct format for the output text of sumy's LexRankSummarizer()

Time: 2018-11-26 04:56:40

Tags: python nlp

I am trying to get the output of LexRankSummarizer from the sumy library as strings. I am using the following (very simple) code:

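from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer

# text holds the input document as a plain string
parser = PlaintextParser.from_string(text, Tokenizer('english'))
summarizer = LexRankSummarizer()
sum_1 = summarizer(parser.document, 10)  # up to 10 sentences
sum_lex = []
for sent in sum_1:
    sum_lex.append(sent)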

With the code above, the output I get is in the form of a tuple. Consider the following text as the input to summarize:

The Mahājanapadas were sixteen kingdoms or oligarchic republics that existed in ancient India from the sixth to fourth centuries BCE.
Two of them were most probably ganatantras (republics) and others had forms of monarchy.

With the code above, the output I get is:

sum_lex = [<Sentence: The Mahājanapadas were sixteen kingdoms or oligarchic republics that existed in ancient India from the sixth to fourth centuries BCE.>,
 <Sentence: Two of them were most probably ganatantras (republics) and others had forms of monarchy.>]

However, if I use print(sent), I get the correct output, i.e. the plain text given above. How can I fix this?

1 answer:

Answer 0: (score: 1)

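Replace sum_lex.append(sent) with sum_lex.append(str(sent)). The summarizer returns Sentence objects, not strings: print(sent) shows the plain text because print() converts its argument with str(), whereas a list displays its elements using their repr, which is the <Sentence: ...> form.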
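
To illustrate the difference without needing sumy installed, here is a minimal sketch. FakeSentence is a hypothetical stand-in written for this example, not sumy's real Sentence class; it only assumes that the real objects behave as shown in the question (plain text from str(), the <Sentence: ...> form from repr()).

```python
class FakeSentence:
    """Hypothetical stand-in for a sumy sentence object (not sumy's real class)."""

    def __init__(self, text):
        self._text = text

    def __str__(self):
        # Used by print(sent) and str(sent)
        return self._text

    def __repr__(self):
        # Used when the object is displayed inside a list or tuple
        return "<Sentence: %s>" % self._text


# Stand-in for the tuple returned by summarizer(parser.document, 10)
sum_1 = (FakeSentence("First sentence."), FakeSentence("Second sentence."))

as_objects = [sent for sent in sum_1]    # what the question's loop builds
sum_lex = [str(sent) for sent in sum_1]  # the fix: convert each item to str
summary_text = " ".join(sum_lex)         # optional: one string for the whole summary

print(as_objects)    # list display uses repr() of each element
print(sum_lex)       # list of plain strings
print(summary_text)
```

With real sumy output, the same list comprehension over sum_1 should yield plain strings, and joining them gives the summary as a single string.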