
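I want to run a simple sentiment analysis on Hebrew text using Polyglot in Python 3.6. The problem is that Polyglot detects the text's language code as 'iw' instead of 'he', so it cannot process the text.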
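For context, 'iw' is the legacy ISO 639-1 code for Hebrew and 'he' is its current replacement, so both codes name the same language. A minimal sketch of normalizing such codes when reading them follows; the names `LEGACY_LANGUAGE_CODES` and `normalize_language_code` are hypothetical helpers, not part of Polyglot, and this does not change what Polyglot stores internally.

```python
# Hypothetical helper, not part of Polyglot.
# Legacy ISO 639-1 codes mapped to their current replacements.
LEGACY_LANGUAGE_CODES = {
    "iw": "he",  # Hebrew
    "in": "id",  # Indonesian
    "ji": "yi",  # Yiddish
}


def normalize_language_code(code):
    """Return the current ISO 639-1 code for a possibly legacy code."""
    cleaned = code.strip().lower()
    # Codes that are not in the legacy table are returned unchanged.
    return LEGACY_LANGUAGE_CODES.get(cleaned, cleaned)


print(normalize_language_code("iw"))  # he
print(normalize_language_code("he"))  # he
```

This only helps code that compares or displays the language code; anything inside Polyglot that looks up resources by the detected code would still see 'iw'.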

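As suggested in the question "use polyglot package for Named Entity Recognition in hebrew", I added hint_language_code = 'he' to the Text constructor call. However, this only changes the language code of the top-level Text object, not of its sub-objects (such as sentences or words).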

For example:

Input:

from polyglot.text import Text

article = 'איך ניתן לנתח טקסט בעברית? והאם ניתן לשנות את הקידוד?'

# Without a hint, the detected language code is 'iw'.
txt = Text(article)
print(txt.language.code)

# With the hint, the top-level Text object reports 'he'.
txt = Text(article, hint_language_code='he')
print(txt.language.code)

# The second sentence still reports 'iw'.
sent = txt.sentences[1]
print(sent.language.code)
print(sent)

Output:

iw
he
iw
והאם ניתן לשנות את הקידוד?
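Since the hint takes effect on the object it is passed to, one possible stopgap is to rebuild each sentence as its own Text with the same hint. This is a minimal sketch under that assumption, not a confirmed fix: `rewrap_sentences` and its `text_cls` parameter are hypothetical names, and it relies only on the `Text(..., hint_language_code=...)` call and the `.sentences` attribute used in the example above. Whether Polyglot's sentiment models then accept the result has not been verified.

```python
def rewrap_sentences(text, text_cls, language_code="he"):
    """Rebuild each sentence of `text` as a separate `text_cls` object.

    `text` is any object exposing `.sentences`; `text_cls` is the class to
    wrap each sentence in (for example polyglot.text.Text). The language
    hint is passed to every new object.
    """
    return [
        text_cls(str(sentence), hint_language_code=language_code)
        for sentence in text.sentences
    ]


# Usage with the objects from the example above (requires Polyglot):
#   hinted = rewrap_sentences(txt, Text)
#   print(hinted[1].language.code)
```

The class is passed in as an argument so the helper has no import-time dependency on Polyglot and can be tried with a stand-in class first.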

Is there a way in Polyglot to permanently change ("fix") the language code of Hebrew text from 'iw' to 'he'?

0 answers: