Question

I am trying to run chatterbot's TwitterTrainer in a standalone program:

from chatterbot import ChatBot
from chatterbot.trainers import TwitterTrainer
from settings import TWITTER
import logging

# Comment out the following line to disable verbose logging
logging.basicConfig(level=logging.INFO)

chatbot = ChatBot("TwitterBot",
    logic_adapters=[
        "chatterbot.logic.BestMatch"
    ],
    input_adapter="chatterbot.input.TerminalAdapter",
    output_adapter="chatterbot.output.TerminalAdapter",
    database="./twitter-database.db",
    twitter_consumer_key=TWITTER["CONSUMER_KEY"],
    twitter_consumer_secret=TWITTER["CONSUMER_SECRET"],
    twitter_access_token_key=TWITTER["ACCESS_TOKEN"],
    twitter_access_token_secret=TWITTER["ACCESS_TOKEN_SECRET"],
    trainer="chatterbot.trainers.TwitterTrainer",
    random_seed_word="random"
)

chatbot.train()

chatbot.logger.info('Trained database generated successfully!')

The error I get looks like this:

  File "C:\Python27\lib\json\decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python27\lib\json\decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x85 in position 94: invalid start byte

The program never runs for more than 3 seconds, but some tweets are written to twitter-database.db before the exception occurs.

Also, when I looked at trainer.py, I saw this:

# TODO: Handle non-ascii characters properly

Any ideas on why this happens and how to fix it?

Answer 1

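Could you try adding the Python source encoding declaration # -*- coding: utf-8 -*- at the top of your file? Errors of this type can occur when it is missing. More information is available at http://chatterbot.readthedocs.io/en/stable/encoding.html#fixing-encoding-errors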
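
For context on the error message itself, here is a minimal sketch of why byte 0x85 is rejected by the UTF-8 codec and one lenient way to decode such bytes. It is not part of ChatterBot's API: the helper name decode_lenient, the sample bytes, and the guess that the data is Windows-1252 (where 0x85 is the ellipsis character) are all assumptions for illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical sample: text ending in byte 0x85, which is "..." (U+2026)
# in Windows-1252 but is not a valid start byte in UTF-8.
raw = b"Read more\x85"


def decode_lenient(data, fallback="cp1252"):
    """Decode bytes as UTF-8; on failure try a legacy code page,
    and as a last resort replace undecodable bytes with U+FFFD."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        try:
            return data.decode(fallback)
        except UnicodeDecodeError:
            return data.decode("utf-8", "replace")


try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # Same failure category as in the traceback above.
    print("strict UTF-8 decode failed:", exc)

print(repr(decode_lenient(raw)))
```

Whether the fallback code page is the right one depends on where the bytes actually come from, so treat cp1252 here as a guess rather than a diagnosis.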
