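I am using Python (pandas) to read a JSON file containing raw tweets, but I get the following error:
ValueError: Unexpected character found when decoding array value (2)
Any help would be appreciated.
Edit: here is a sample of the JSON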
{"created_at":"Sat Nov 16 14:15:52 +0000 2019","id":1195707056365461505,"id_str":"1195707056365461505","text":"Any Arsenal Red members on here, message me please... have a few questions \ud83d\ude05\ud83e\udd14","source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":974846850,"id_str":"974846850","name":"Rico Rodrigo","screen_name":"DatGuyTy_online","location":"Brum","url":null,"description":"Aspiring accountant x Arsenal fan x anime fan","translator_type":"none","protected":false,"verified":false,"followers_count":647,"friends_count":901,"listed_count":9,"favourites_count":24989,"statuses_count":24628,"created_at":"Tue Nov 27 22:25:31 +0000 2012","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":null,"contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http://pbs.twimg.com/profile_images/1071377159682514945/Np4nGX5m_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/1071377159682514945/Np4nGX5m_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/974846850/1554183093","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1573913752057"}
This is the code I use to read the file:
import numpy as np
import pandas as pd
import re
import matplotlib.pyplot as plt
import json
import os
tweet_file = 'raw_data.json'
tweets = pd.read_json(tweet_file, convert_dates=True, lines=True, encoding='utf-8')
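One way to narrow this error down, sketched here under the assumption that raw_data.json is meant to hold one JSON object per line (which is what lines=True expects), is to parse the file line by line with the standard json module and record which lines fail. The helper name load_json_lines is hypothetical and only for illustration; it is not part of pandas.

```python
import json

import pandas as pd


def load_json_lines(path):
    """Parse a file one line at a time and report lines that are not valid JSON.

    Hypothetical helper: assumes each non-blank line should be one JSON object.
    Returns (DataFrame of the parsed records, list of (line_number, error_message)).
    """
    records = []
    bad_lines = []
    with open(path, encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Keep the position and the parser's message for inspection
                bad_lines.append((line_no, str(exc)))
    return pd.DataFrame(records), bad_lines
```

Calling load_json_lines('raw_data.json') returns the rows that parsed cleanly together with the line numbers that did not, which should show whether the file contains truncated or otherwise malformed records. Nested objects such as "user" stay as dict values in their column in this sketch.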