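I'm currently using the Sentiment140 API to classify some tweets, and I'm writing everything in Python. Everything works fine, but I'm having trouble with the output and with storing it.
I use the following code to store the retrieved data: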
data = json.dumps(values) # instead of urllib.urlencode(values)
response = urllib2.urlopen(url, data)
page = response.read()
print page
with open('result.json', 'w') as f:
    json.dump(page, f, indent=2)
The print statement gives the following:
{"data":[{"id":"1","text":"How deep is your love - Micheal Buble Ft Kelly Rowland \\u00e2\\u2122\\u00a5","polarity":2,"meta":{"language":"en"}},{"id":"2","text":"RT @TrueTeenQuotes: #SongsThatNeverGetOld Nelly ft. Kelly Rowland - Dilemma","polarity":2,"meta":{"language":"en"}},{"id":"3","text":"RT @GOforCARL: Dilemma - Nelly Feat. Kelly Rowland #Ohh #SongsThatNeverGetOld","polarity":2,"meta":{"language":"en"}},{"id":"4","text":"#NP Kelly Rowland Grown Woman","polarity":2,"meta":{"language":"en"}}]}
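So all the data I need is there, unfortunately on a single line, but that's fine. Now I'm trying to save the data in a proper, nicely formatted way. The saved file looks like this: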
"{\"data\":[{\"id\":\"1\",\"text\":\"How deep is your love - Micheal Buble Ft Kelly Rowland \\\\u00e2\\\\u2122\\\\u00a5\",\"polarity\":2,\"meta\":{\"language\":\"en\"}},{\"id\":\"2\",\"text\":\"RT @TrueTeenQuotes: #SongsThatNeverGetOld Nelly ft. Kelly Rowland - Dilemma\",\"polarity\":2,\"meta\":{\"language\":\"en\"}},{\"id\":\"3\",\"text\":\"RT @GOforCARL: Dilemma - Nelly Feat. Kelly Rowland #Ohh #SongsThatNeverGetOld\",\"polarity\":2,\"meta\":{\"language\":\"en\"}},{\"id\":\"4\",\"text\":\"#NP Kelly Rowland Grown Woman\",\"polarity\":2,\"meta\":{\"language\":\"en\"}}]}\n"
It's still on one line, and now full of "\" characters. Which command am I missing?
Answer 0 (score: 0)
Found the answer myself, although I don't know whether it's the most elegant way:
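# Decode the raw response to unicode, dropping any bytes that cannot be decoded
final = unicode(page, errors='ignore')
# Parse the JSON text into Python objects (dicts and lists)
final = json.loads(final)
with open('result.json', 'w') as f:
    # indent=0 only inserts newlines; use indent=2 for nested indentation
    json.dump(final, f, indent=0)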
Done.
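For anyone wondering why the first attempt produced the backslashes: page is already a JSON string, so passing it to json.dump encodes it a second time as one big string literal, and indent has nothing to indent. Parsing it first with json.loads is what makes the difference. Here is a minimal sketch of that, written for Python 3 rather than the Python 2 used above, with a shortened sample string standing in for the real API response (an assumption for illustration, not the actual service output):

```python
import json

# Stand-in for the raw response body (`page` in the question), shortened.
page = ('{"data":[{"id":"4","text":"#NP Kelly Rowland Grown Woman",'
        '"polarity":2,"meta":{"language":"en"}}]}')

# Dumping the string itself encodes it a second time: the result is a single
# JSON string literal with every inner quote escaped, still on one line.
double_encoded = json.dumps(page, indent=2)

# Parsing first yields a dict, which json.dumps can then pretty-print.
parsed = json.loads(page)
pretty = json.dumps(parsed, indent=2)

print(double_encoded)
print(pretty)
```

Writing to a file works the same way: json.dump(parsed, f, indent=2) inside the with block. If the escaped \uXXXX sequences in the output are unwanted, ensure_ascii=False is the json option to look at, though whether it helps here depends on how the text was encoded upstream.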