Question

I am trying to extract only the phrases written in French. I tried using polyglot. It works for a single sentence:

from polyglot.detect import Detector

comments = "hoje esta frio"
detector = Detector(comments)
print(detector.language)

Output:

name: Portuguese code: pt confidence: 93.0 read bytes: 1706

But when I have several phrases in several languages, it does not work:

Text:

hoje esta frio

me gusta mucho

je suis malade

非和平广场

你好，我要

bonjour, tout va bien

import polyglot
from polyglot.text import Text, Word
from polyglot.detect import Detector

with open('test_langages.txt', 'r') as f:
    comments = f.read().split('\n')

    for c in comments:
        detector = Detector(c)
        print c, detector.language

Output:

hoje esta frio name: Portuguese code: pt confidence: 93.0 read bytes: 1706

No handlers could be found for logger "polyglot.detect.base"

me gusta mucho name: espagnol code: es confidence: 93.0 read bytes: 2525

je suis malade Traceback (most recent call last):
  File "language.py", line 12, in <module>
    print c, detector.language
  File "/usr/local/lib/python2.7/dist-packages/polyglot/detect/base.py", line 43, in __str__
    self.confidence, self.read_bytes))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 4: ordinal not in range(128)
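
The traceback suggests the failure happens when the detected language is converted to a string for printing, not during detection itself. A minimal sketch of one possible workaround, assuming that is the cause: compare the language code against 'fr' instead of printing the whole object. The names french_lines and polyglot_code are hypothetical helpers introduced here for illustration; Detector(..., quiet=True) and the .language.code attribute are polyglot calls, and the rest is plain Python.

```python
def french_lines(lines, detect_code):
    """Return the non-empty lines whose detected language code is 'fr'.

    detect_code is any callable mapping a text to a language code,
    which keeps the filtering logic independent of the detector used.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. the one after a trailing '\n'
        if detect_code(line) == "fr":
            kept.append(line)
    return kept


def polyglot_code(text):
    """Hypothetical adapter: return polyglot's language code for text."""
    # Imported lazily so the filter above can be used without polyglot.
    from polyglot.detect import Detector
    # quiet=True asks polyglot not to raise on unreliable detections,
    # which are likely for very short phrases.
    return Detector(text, quiet=True).language.code
```

With the file from the question, this could be called as french_lines(io.open('test_langages.txt', encoding='utf-8').read().split('\n'), polyglot_code). Reading the file as UTF-8 is an assumption about its encoding.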

Multi-language detection

0 answers: