I want to load JSON extracted from the Twitter API into Python. Below is a sample of the JSON objects:
{"created_at":"Mon Apr 22 18:17:09 +0000 2019","id":1120391103813910529,"id_str":"1120391103813910529","text":"On peut dire que la base de cette 8e saison est en place \ud83d\ude4c #GOTS8E2","source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":243071138,"id_str":"243071138","name":"Mr B","screen_name":"skeyos","location":"Namur","url":null,"description":null,"translator_type":"none","protected":false,"verified":false,"followers_count":197,"friends_count":1811,"listed_count":6,"favourites_count":7826,"statuses_count":8044,"created_at":"Wed Jan 26 06:49:05 +0000 2011","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"fr","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/493833348167770112\/aGLGemZ5_normal.jpeg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/493833348167770112\/aGLGemZ5_normal.jpeg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/243071138\/1406574068","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"GOTS8E2","indices":[59,67
]}],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"fr","timestamp_ms":"1555957029666"}
{"created_at":"Mon Apr 22 18:17:14 +0000 2019","id":1120391124722565123,"id_str":"1120391124722565123","text":"...
I am trying the following code:
import json

with open('tweets.json') as tweet_data:
    json_data = json.load(tweet_data)
But I get the following error:
JSONDecodeError: Extra data: line 3 column 1 (char 2149)
Unfortunately, I cannot edit the JSON file much, because it is really large. I need to figure out how to read it into Python. Any help would be appreciated!
Edit: the following code works:
import json

dat = []
with open('data_tweets_E2.json', 'r') as f:
    for l in f:
        if not l.strip():  # skip empty lines
            continue
        json_data = json.loads(l)
        dat.append(json_data)
Answer 0 (score: 1)
Each line contains a new object, so try parsing them line by line.
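A minimal sketch of line-by-line parsing, assuming each non-blank line holds exactly one complete JSON object. The helper name `parse_json_lines` and the two-record sample are hypothetical and stand in for the real file:

```python
import io
import json


def parse_json_lines(lines):
    """Parse an iterable of text lines, one JSON object per non-blank line."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank separator lines
            continue
        records.append(json.loads(line))
    return records


# Hypothetical two-record sample standing in for the real tweet file
sample = io.StringIO('{"id": 1, "text": "first"}\n\n{"id": 2, "text": "second"}\n')
tweets = parse_json_lines(sample)
print(len(tweets), tweets[1]["text"])
```

With a real file, the open file object can be passed directly, for example `with open('tweets.json') as f: tweets = parse_json_lines(f)`. Iterating over the file object reads one line at a time, which should help with a large file.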
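Answer 1 (score: 1)
Each line contains a separate JSON object; parse them and store them in a list:
import json

with open('tweets.json', 'r') as tweet_data:
    # keep only non-blank lines; json.loads fails on an empty string
    values = [json.loads(line) for line in tweet_data
              if line.strip()]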
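If an object happens to span more than one line, splitting on newlines will not work. One possible alternative, sketched here under the assumption that the whole text fits in memory, is to use `json.JSONDecoder.raw_decode`, which parses one value and reports where it ended. The helper name `parse_concatenated_json` and the sample string are hypothetical:

```python
import json


def parse_concatenated_json(text):
    """Parse JSON values that follow one another, separated only by whitespace."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it first
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


# Hypothetical sample: the first object is broken across two lines
sample = '{"id": 1, "indices": [59,\n67]}\n\n{"id": 2}\n'
objects = parse_concatenated_json(sample)
print(objects)
```

The trade-off is memory: this reads everything at once, so for a very large file the line-by-line approach is likely the better fit whenever each object sits on its own line.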
Answer 2 (score: 1)
Here is the code. You need to install Pandas first. If this solution helps you, please mark this answer as accepted (the green check).
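import json
import pandas as pd

with open('tweets.json') as json_file:
    data_list = json.load(json_file)

tweet_data_frame = pd.DataFrame.from_dict(data_list)
print(tweet_data_frame)
print(data_list)
As you can see, print(data_list) prints a list and print(tweet_data_frame) prints the data frame.
To check the types of these variables, just use type(), for example print(type(data_list)).
Important: what I am trying to tell you is that your JSON file is not well-formed as a single JSON document, which is what json.load expects. If you have more than one JSON object, they must be inside an array. Try with a different JSON file.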
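The point about arrays can be illustrated with a small sketch. It assumes one object per non-blank line, and the string `raw` is a hypothetical stand-in for the file contents. Parsing it as a single document fails, while joining the lines into one array parses cleanly:

```python
import json

# Hypothetical stand-in for the file contents: two objects, one per line
raw = '{"id": 1}\n{"id": 2}\n'

try:
    json.loads(raw)
    failed = False
except json.JSONDecodeError as exc:
    failed = True
    print("single-document parse failed:", exc.msg)

# Join the non-blank lines with commas and wrap them in brackets
lines = [line for line in raw.splitlines() if line.strip()]
array_text = "[" + ",".join(lines) + "]"
data_list = json.loads(array_text)
print(data_list)
```

The resulting `data_list` is a plain list of dicts, so it could then be handed to `pd.DataFrame` as in the code above. Building the array text in memory is only practical if the file is not too large.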