Question

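I can't save encoded data to a CSV file. I could decode the CSV file afterwards, but I would rather do all the data cleaning beforehand. I managed to save just the text, but as soon as I add a timestamp it stops working.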

What am I doing wrong? I read that if str() and .encode() don't work you should try .join, but still no luck.

Error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)

Code:

def on_data(self, data):
    try:
        #print data
        tweet = data.split(',"text":"')[1].split('","source')[0]

        x = tweet.encode('utf-8')
        y = x.decode('unicode-escape')
        print y

        saveThis = y
        #saveThis = str(time.time())+'::' + tweet.decode('ascii', 'ignore')
        #saveThis = u' '.join((time.time()+'::'+tweet)).encode('utf-8')

        saveFile = open('twitDB.csv', 'a')
        saveFile.write(saveThis)
        saveFile.write('\n')
        saveFile.close()
        return True
    except BaseException, e:
        print 'fail on data,', str(e)
        time.sleep(5) 
def on_error(self, status):
    print status

Answer 1

First, make sure you use the json module to handle your JSON data correctly.
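As a minimal sketch of that first point: json.loads resolves \uXXXX escapes and escaped quotes, which string splitting leaves behind as literal characters. The payload below is made up for illustration and is not real Twitter output.

```python
import json

# Hypothetical payload, shaped like the fragment the question splits on.
raw = '{"id": 1, "text": "caf\\u00e9 \\"ol\\u00e9\\"", "source": "web"}'

# String splitting keeps the JSON escapes as literal backslash sequences.
by_split = raw.split('"text": "')[1].split('", "source')[0]

# json.loads returns real Unicode text with every escape resolved.
by_json = json.loads(raw)['text']
```

Here by_split still contains the six-character sequence \u00e9, while by_json holds the accented character itself, so no unicode-escape round trip is needed afterwards.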

Next, don't catch BaseException; you have no reason to catch memory errors or keyboard interrupts here. Catch more specific exceptions instead.
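To illustrate the difference, here is a small sketch. The helper name would_catch is hypothetical; it only checks the class hierarchy that an except clause relies on.

```python
import json

def would_catch(handler_types, exc_type):
    """Return True if 'except handler_types' would catch exc_type."""
    return issubclass(exc_type, handler_types)

# A BaseException handler also swallows Ctrl-C.
broad = would_catch(BaseException, KeyboardInterrupt)

# A handler for the two expected failures lets Ctrl-C propagate.
narrow = would_catch((ValueError, KeyError), KeyboardInterrupt)

# Malformed JSON raises ValueError (or a subclass of it).
try:
    json.loads('not json')
    bad_json_is_value_error = False
except ValueError:
    bad_json_is_value_error = True
```

With the narrow handler, pressing Ctrl-C stops the stream instead of printing a failure message and sleeping.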

Finally, encode the data before writing it:

import json

def on_data(self, data):
    try:
        tweet = json.loads(data)['text']
    except (ValueError, KeyError):
        # Not JSON or no text key
        print 'fail on data {}'.format(data)
        return

    # json.loads returns Unicode text; encode it to UTF-8 bytes before writing
    with open('twitDB.csv', 'a') as save_file:
        save_file.write(tweet.encode('utf8') + '\n')
        return True
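
The question also wanted a timestamp on each line, which the code above does not cover. One possible approach, sketched here under assumptions, is to build the whole line as Unicode text and let io.open encode it once at write time; this works on Python 2.6+ and Python 3. The function name save_tweet and the '::' separator (borrowed from the commented-out attempt in the question) are illustrative choices, not part of the original answer.

```python
import io
import json
import time

def save_tweet(data, path='twitDB.csv'):
    """Append 'timestamp::text' for one raw JSON tweet; return True on success."""
    try:
        tweet = json.loads(data)['text']
    except (ValueError, KeyError, TypeError):
        # Not JSON, not a JSON object, or no text key.
        return False
    # Build the line as Unicode text; io.open encodes it to UTF-8 on write.
    line = u'{0}::{1}\n'.format(time.time(), tweet)
    with io.open(path, 'a', encoding='utf-8') as save_file:
        save_file.write(line)
    return True
```

Inside the listener, on_data could simply return save_tweet(data). Tweets that contain newlines would still span several lines in the file; the csv module's quoting is one way to handle that.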

Python: writing UTF-8 to CSV
