How to load Twitter JSON objects into Python

Time: 2019-04-30 23:10:05

Tags: python json

I want to load JSON extracted from the Twitter API into Python. Below is a sample of the JSON objects:

{"created_at":"Mon Apr 22 18:17:09 +0000 2019","id":1120391103813910529,"id_str":"1120391103813910529","text":"On peut dire que la base de cette 8e saison est en place \ud83d\ude4c #GOTS8E2","source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":243071138,"id_str":"243071138","name":"Mr B","screen_name":"skeyos","location":"Namur","url":null,"description":null,"translator_type":"none","protected":false,"verified":false,"followers_count":197,"friends_count":1811,"listed_count":6,"favourites_count":7826,"statuses_count":8044,"created_at":"Wed Jan 26 06:49:05 +0000 2011","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"fr","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/493833348167770112\/aGLGemZ5_normal.jpeg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/493833348167770112\/aGLGemZ5_normal.jpeg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/243071138\/1406574068","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"GOTS8E2","indices":[59,67
]}],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"fr","timestamp_ms":"1555957029666"}

{"created_at":"Mon Apr 22 18:17:14 +0000 2019","id":1120391124722565123,"id_str":"1120391124722565123","text":"...

I am trying the following code:

import json

with open('tweets.json') as tweet_data:
    json_data = json.load(tweet_data)

But I get the following error:

JSONDecodeError: Extra data: line 3 column 1 (char 2149)

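Unfortunately, I can't do much manual editing of the JSON because the file is really large. I need to figure out how to read it into Python. Any help would be greatly appreciated!

Edit: the following code works: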

import json

dat = []
with open('data_tweets_E2.json', 'r') as f:
    for l in f:
        if not l.strip():  # skip empty lines
            continue

        json_data = json.loads(l)  # each non-empty line is one JSON object
        dat.append(json_data)

3 Answers:

Answer 0 (score: 1)

Each line contains a new object, so try parsing them line by line.
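
A minimal sketch of that line-by-line approach, assuming the file holds one complete JSON object per line with optional blank lines in between. The helper name `load_json_lines` and the two-record sample are hypothetical, for illustration only:

```python
import json
import os
import tempfile


def load_json_lines(path):
    """Parse a file with one JSON object per line; blank lines are skipped."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Hypothetical sample standing in for the real tweets file.
sample = '{"id": 1, "text": "first"}\n\n{"id": 2, "text": "second"}\n'
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)

tweets = load_json_lines(tmp.name)
os.remove(tmp.name)

print(len(tweets), tweets[1]["text"])
```

Iterating over the file object reads one line at a time, so the whole file never has to be held in memory as a single string, which matters for a large dump.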

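Answer 1 (score: 1)

Each line contains a separate JSON object. Parse them and store them in a list:

import json

with open('tweets.json', 'r') as tweet_data:
    # keep only non-blank lines; each one holds a single JSON object
    values = [json.loads(line) for line in tweet_data
              if line.strip()]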

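Answer 2 (score: 1)

Here is the code. You need to install pandas first. If this solution helps you, please mark this answer as accepted (the green check mark).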

import json
import pandas as pd

with open('tweets.json') as json_file:
    data_list = json.load(json_file)

tweet_data_frame = pd.DataFrame.from_dict(data_list)

print(tweet_data_frame)
print(data_list)

As you can see, print(data_list) prints a list and print(tweet_data_frame) prints the DataFrame.

If you want to see the types of these variables, just use type(), for example print(type(data_list)).

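Important: what I am trying to tell you is that your JSON file is not in a valid format and contains many errors. If you have more than one JSON object, they must be inside an array, that is, enclosed in [ ] and separated by commas. Your JSON file has errors; try a different JSON file.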
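
To illustrate the point about arrays, here is a small sketch using only the standard `json` module. The two sample strings are made up for the demonstration: the first mimics a file with one object after another, the second wraps the same objects in an array.

```python
import json

# Two objects back to back, as in the file from the question (hypothetical data).
two_objects = '{"id": 1}\n{"id": 2}'
# The same objects wrapped in a single top-level array.
as_array = '[{"id": 1}, {"id": 2}]'

try:
    json.loads(two_objects)
    error = None
except json.JSONDecodeError as exc:
    # The parser reads the first object, then finds more text after it.
    error = exc.msg

parsed = json.loads(as_array)

print(error)
print(parsed)
```

The first call fails with the same "Extra data" message as in the question, because `json.load` and `json.loads` expect exactly one top-level value. The file either has to be wrapped in an array like this or parsed one line at a time.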