Error when trying to read a JSON file with pandas

Time: 2019-11-18 19:04:36

Tags: python json python-3.x pandas twitter

I am using Python (pandas) to read a JSON file containing raw tweets, but I get the following error:

ValueError: Unexpected character found when decoding array value (2)

Any help would be appreciated.

Edit: here is a sample of the JSON

{"created_at":"Sat Nov 16 14:15:52 +0000 2019","id":1195707056365461505,"id_str":"1195707056365461505","text":"Any Arsenal Red members on here, message me please... got a few questions \ud83d\ude05\ud83e\udd14","source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":974846850,"id_str":"974846850","name":"Rico Rodrigo","screen_name":"DatGuyTy_online","location":"Brum","url":null,"description":"Aspiring accountant x Arsenal fan x Anime fan","translator_type":"none","protected":false,"verified":false,"followers_count":647,"friends_count":901,"listed_count":9,"favourites_count":24989,"statuses_count":24628,"created_at":"Tue Nov 27 22:25:31 +0000 2012","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":null,"contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http://pbs.twimg.com/profile_images/1071377159682514945/Np4nGX5m_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/1071377159682514945/Np4nGX5m_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/974846850/1554183093","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1573913752057"}

This is the code I am using to read the file:

import numpy as np 
import pandas as pd 
import re 
import matplotlib.pyplot as plt 
import json 
import os

tweet_file = 'raw_data.json' 
tweets = pd.read_json(tweet_file, convert_dates=True, lines=True, encoding='utf-8')
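One way to narrow this down, offered only as a minimal sketch: with lines=True, pandas expects every non-blank line of the file to be one complete JSON object, so a record that is truncated or split across lines could plausibly produce a decoding error like the one above. The helper below (find_bad_lines is a hypothetical name, not part of pandas) parses each line with the standard json module and reports the lines that fail. It assumes the file is UTF-8 text with one tweet per line.

```python
import json


def find_bad_lines(path, encoding="utf-8"):
    """Return (line_number, error_message) for each non-blank line
    that is not a complete, valid JSON document on its own."""
    bad = []
    with open(path, encoding=encoding) as fh:
        for lineno, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are skipped, not reported
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                bad.append((lineno, str(exc)))
    return bad
```

Calling find_bad_lines('raw_data.json') returns a list of line numbers with the parser's message for each. An empty list would suggest that every line parses on its own and that the problem lies elsewhere.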

0 answers:

No answers yet