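About saving in UTF-8
I have a Python script that uses the Twitter API and saves some tweets in a JSON structure.
When the text contains characters such as á, é, í, ó or ú, it is not saved correctly; instead, those characters are replaced by patterns such as \u00f1.
This is my script: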
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import twitter
import json
from api.search import *
from api.tpt import *
from tweet.tweet import *
from time import *

WORLD_WOE_ID = 23424900
trendingTopics = startTrendsDictionary()
twitterAPI = createAPIConsumer()
print "Twitter API started"
while True:
    mexican_trends = twitterAPI.trends.place(_id=WORLD_WOE_ID)
    print "Current trends fetched"
    with open('../files/trends/trends '+strftime("%c"), 'w') as trendsFile:
        trendsFile.write(json.dumps(mexican_trends,indent=1))
    print "Current trends at +- {t} saved".format(t=strftime("%c"))
    statuses = harvestTrendingTopicTweets(twitterAPI,trendingTopics,100,10)
    print "Harvest done"
    with open('../files/tweets/unclassified '+strftime("%c"), 'w') as statusesFile:
        statusesFile.write(json.dumps(statuses,indent=1).encode('utf-8'))
    print "File saved"
    print "We're going to wait in order not to fed up Twitter API"
    sleep(2400)
    print "OK, it was enough waiting, here we go again"
I thought that both of these:
# -*- coding: utf-8 -*-
.encode('utf-8')
would solve it, but they don't.
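For anyone reading along, here is a minimal sketch of what appears to be happening. `json.dumps` escapes every non-ASCII character by default (its `ensure_ascii` parameter defaults to `True`), so the string it returns is already pure ASCII and calling `.encode('utf-8')` on it changes nothing. The `# -*- coding: utf-8 -*-` line only declares the encoding of the script's own source file. The `statuses` data and the `unclassified.json` file name below are made up for illustration, and the sketch is written against Python 3; on Python 2 the return type of `json.dumps(..., ensure_ascii=False)` can vary, so treat it as illustrative rather than a drop-in fix.

```python
# -*- coding: utf-8 -*-
import io
import json
import os
import tempfile

# Hypothetical stand-in for the harvested statuses (not real Twitter data).
statuses = {u"trend": [{u"text": u"ma\u00f1ana ser\u00e1 otro d\u00eda"}]}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(statuses, indent=1)

# ensure_ascii=False keeps the characters themselves in the output.
readable = json.dumps(statuses, indent=1, ensure_ascii=False)

# Write the readable form through a file object that encodes to UTF-8.
# The path is a throwaway temp file chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "unclassified.json")
with io.open(path, "w", encoding="utf-8") as statuses_file:
    statuses_file.write(readable)

# Read the raw text back, as a text editor would see it.
with io.open(path, "r", encoding="utf-8") as statuses_file:
    saved_text = statuses_file.read()
```

Both `escaped` and `readable` are valid JSON describing the same data; the only difference is whether a human sees the accented letters or their escape sequences in the file.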
About reading in UTF-8
When it comes to reading, I am trying:
import json

with open('file', 'r', buffering=1) as f:
    tweetsJSON = json.load(f)
    for category in tweetsJSON:
        for trend in tweetsJSON[category]:
            for t in tweetsJSON[category][trend]:
                print t['text']
                print
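In this case, printing to the console, *I can see all of those letters displayed correctly.*
So why do they not look right when I open the saved file in a text editor (Sublime Text, in my case)?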
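A small sketch of why the two views probably differ: the saved file contains the literal escape sequence (a backslash, a `u` and four hex digits), which is what a text editor displays, while the JSON parser converts each escape back into the real character before anything is printed. The `raw` sample below is a made-up one-line stand-in for the saved file, not actual output from the script.

```python
import json

# Hypothetical raw file contents, exactly as a text editor would show them.
raw = '{"text": "ma\\u00f1ana"}'

# The parser turns the six-character escape back into one character.
decoded = json.loads(raw)["text"]

# The raw text is longer than the decoded text by the five extra characters
# of the escape sequence.
raw_text_field = raw[len('{"text": "'):-len('"}')]
length_difference = len(raw_text_field) - len(decoded)
```

So the file is valid JSON and no data is lost; the escapes are just a different spelling of the same characters.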