Nested JSON values cause "TypeError: Object of type 'int64' is not JSON serializable"

Time: 2019-01-07 21:37:51

Tags: python json pandas python-requests

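Hoping to get some help here. Full disclosure: this is my first "purposeful" Python script. Before this I had only dabbled, and honestly I'm still learning, so maybe I jumped into this too early.

Long story short, I've been working through various type mismatches and general indentation problems (dear lord, Python is unforgiving about that).

I think I'm almost done, but a last few problems remain, and most of them seem to come from the same section. The script is only meant to take a CSV file with 3 columns and use it to send requests based on the first column (iOS or Android). The problem comes when I build the body to send... Here is the code (some tokens omitted for posting):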

#!/usr/bin/python
# -*- coding: utf-8 -*-

import requests
import json
import pandas as pd
from tqdm import tqdm
from datetime import *
import uuid
import warnings
from math import isnan
import time


## throttling based on AF's 80 request per 2 minute rule
def throttle():
    i = 0
    while i <= 3:
        print ("PAUSED FOR THROTTLING!" + "\n" + str(3-i) + " minutes remaining")
        time.sleep(60)
        i = i + 1
        print (i)
    return 0

## function for reformating the dates
def date():
    d = datetime.utcnow()  # # <-- get time in UTC
    d = d.isoformat('T') + 'Z'
    t = d.split('.')
    t = t[0] + 'Z'
    return str(t)


## function for dealing with Android requests
def android_request(madv_id,mtime,muuid,android_app,token,endpoint):
    headers = {'Content-Type': 'application/json', 'Accept': 'application/json'}

    params = {'api_token': token }

    subject_identities = {
        "identity_format": "raw",
        "identity_type": "android_advertising_id", 
        "identity_value": madv_id
    }

    body = {
        'subject_request_id': muuid,
        'subject_request_type': 'erasure',
        'submitted_time': mtime,
        'subject_identities': dict(subject_identities),
        'property_id': android_app
        }
    body = json.dumps(body)
    res = requests.request('POST', endpoint, headers=headers,
                           data=body, params=params)
    print("android " + res.text)

## function for dealing with iOS requests
def ios_request(midfa, mtime, muuid, ios_app, token, endpoint):
    headers = {'Content-Type': 'application/json',
               'Accept': 'application/json'}
    params = {'api_token': token}

    subject_identities = {
        'identity_format': 'raw',
        'identity_type': 'ios_advertising_id',
        'identity_value': midfa,
    }
    body = {
        'subject_request_id': muuid,
        'subject_request_type': 'erasure',
        'submitted_time': mtime,
        'subject_identities': list(subject_identities),
        'property_id': ios_app,
        }

    body = json.dumps(body)
    res = requests.request('POST', endpoint, headers=headers, data=body, params=params)
    print("ios " + res.text)

## main run function. Determines whether it is iOS or Android request and sends if not LAT-user
def run(output, mdf, is_test):

  # # assigning variables to the columns I need from file

    print ('Sending requests! Stand by...')
    platform = mdf.platform
    device = mdf.device_id

    if is_test=="y":
        ios = 'id000000000'
        android = 'com.tacos.okay'
        token = 'OMMITTED_FOR_STACKOVERFLOW_Q'
        endpoint = 'https://hq1.appsflyer.com/gdpr/stub'
    else:
        ios = 'id000000000'
        android = 'com.tacos.best'
        token = 'OMMITTED_FOR_STACKOVERFLOW_Q'
        endpoint = 'https://hq1.appsflyer.com/gdpr/opengdpr_requests'


    for position in tqdm(range(len(device))):
        if position % 80 == 0 and position != 0: 
            throttle()
        else:
            req_id = str(uuid.uuid4())
            timestamp = str(date())

            if platform[position] == 'android' and device[position] != '':
                android_request(device[position], timestamp, req_id, android, token, endpoint)
                mdf['subject_request_id'][position] = req_id

            if platform[position] == 'ios' and device[position] != '':
                ios_request(device[position], timestamp, req_id, ios, token, endpoint)
                mdf['subject_request_id'][position] = req_id

            if 'LAT' in platform[position]:
                mdf['subject_request_id'][position] = 'null'
                mdf['error status'][position] = 'Limit Ad Tracking Users Unsupported. Device ID Required'

            mdf.to_csv(output, sep=',', index = False, header=True)
        # mdf.close()

    print ('\nDONE. Please see ' + output 
        + ' for the subject_request_id and/or error messages\n')

## takes the CSV given by the user and makes a copy of it for us to use
def read(mname):
    orig_csv = pd.read_csv(mname)
    mdf = orig_csv.copy()

    # Check that both dataframes are actually the same
    # print(pd.DataFrame.equals(orig_csv, mdf))

    return mdf

## just used to create the renamed file with _LOGS.csv
def rename(mname):
    msuffix = '_LOG.csv'
    i = mname.split('.')
    i = i[0] + msuffix
    return i

## adds relevant columns to the log file
def logs_csv(out, df):
    mdf = df
    mdf['subject_request_id'] = ''
    mdf['error status'] = ''
    mdf['device_id'].fillna('')
    mdf.to_csv(out, sep=',', index=None, header=True)

    return mdf

## solely for reading in the file name from the user. creates string out of filename
def readin_name():
    mprefix = input('FILE NAME: ')
    msuffix = '.csv'
    mname = str(mprefix + msuffix)
    print ('\n' + 'Reading in file: ' + mname)
    return mname

def start():
    print ('\nWelcome to GDPR STREAMLINE')
    # # blue = OpenFile()
    testing = input('Is this a test? (y/n) : ')

    # return a CSV
    name = readin_name()
    import_csv = read(name)
    output_name = rename(name)

    output_file = logs_csv(output_name, import_csv)

    run( output_name, output_file, testing)


  # # print ("FILE PATH:" + blue)

## to disable all warnings in console logs

warnings.filterwarnings('ignore')
start()

Here is the error stack trace:

Reading in file: test.csv
Sending requests! Stand by...
  0%|                                                                                                                                                        | 0/384 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "a_GDPR_delete.py", line 199, in <module>
    start()
  File "a_GDPR_delete.py", line 191, in start
    run( output_name, output_file, testing)
  File "a_GDPR_delete.py", line 114, in run
    android_request(device[position], timestamp, req_id, android, token, endpoint)
  File "a_GDPR_delete.py", line 57, in android_request
    body = json.dumps(body)
  File "/Users/joseph/anaconda3/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/joseph/anaconda3/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/joseph/anaconda3/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/Users/joseph/anaconda3/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'int64' is not JSON serializable

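TL;DR: I get a TypeError when calling json.dumps on a JSON body that contains another nested JSON object. I've confirmed the nested JSON is the problem, because if I remove the "subject_identities" part, the code runs and works... but the API I'm using requires those values, so the request doesn't actually do anything without that part.

Here again is the relevant code (as it WAS in the version I originally had working):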

def android (madv_id, mtime, muuid):
  headers = {
      "Content-Type": "application/json",
      "Accept": "application/json"
  }
  params = {
      "api_token": "OMMITTED_FOR_STACKOVERFLOW_Q"
  }
  body = {
     "subject_request_id": muuid, #muuid, 
     "subject_request_type": "erasure", 
     "submitted_time": mtime, 
     "subject_identities": [
        { "identity_type": "android_advertising_id", 
           "identity_value": madv_id, 
           "identity_format": "raw" }
        ], 
     "property_id": "com.tacos.best" 

  } 
  body = json.dumps(body) 
  res = requests.request("POST", 
  "https://hq1.appsflyer.com/gdpr/opengdpr_requests", 
  headers=headers, data=body, params=params)

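I feel like I'm close to getting this working. Early on I had a much simpler version that worked, but I rewrote it to be more dynamic and to use fewer hard-coded values (so that I can eventually apply it to any app I'm working with, not just these two).

Please be nice: I'm new to Python and also rusty at coding in general (hence trying a project like this).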

2 answers:

Answer 0 (score: 0)

You can check for numpy dtypes like this:

if hasattr(obj, 'dtype'):
    obj = obj.item()

This converts the value to the closest equivalent native Python type.

Edit: Apparently np.nan is JSON serializable, so I removed that part from my answer.
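One way to apply the check above to nested values is to pass it to json.dumps as a `default` hook, which the encoder calls for any object it cannot serialize, at any depth. The following is only a minimal sketch: `to_native` is a name made up for this example, and `FakeInt64` is a hypothetical stand-in for `numpy.int64` so that the snippet runs without numpy installed. With real numpy scalars the same hook should apply, since they expose both `dtype` and `item()`.

```python
import json


def to_native(obj):
    """json.dumps `default` hook: unwrap numpy-style scalars via .item()."""
    if hasattr(obj, "dtype") and hasattr(obj, "item"):
        return obj.item()
    # Anything else is still rejected, mirroring the encoder's own error.
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


class FakeInt64:
    """Hypothetical stand-in for numpy.int64 (assumption for this sketch)."""

    dtype = "int64"

    def __init__(self, value):
        self._value = value

    def item(self):
        # numpy's .item() returns the equivalent built-in Python scalar.
        return int(self._value)


body = {
    "subject_request_id": "example-request-id",  # placeholder value
    "subject_identities": [
        {
            "identity_type": "android_advertising_id",
            "identity_value": FakeInt64(12345),  # nested, not JSON-native
            "identity_format": "raw",
        }
    ],
}

# Without the hook this raises TypeError; with it, the nested value is unwrapped.
encoded = json.dumps(body, default=to_native)
print(encoded)
```

The hook leaves the body untouched and only converts values at encoding time, so it does not matter how deeply the offending value is nested.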

Answer 1 (score: 0)

Thanks, everyone, for the quick help here. Apparently the error message misled me, because @juanpa.arrivillaga's fix did the job with one tweak.

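The corrected code is in these two calls. Here:

android_request(str(device[position]), timestamp, req_id, android, token, endpoint)

and here:

ios_request(str(device[position]), timestamp, req_id, ios, token, endpoint)

Apparently I had to cast the device ID to a string, even though these values did not look like integers to begin with.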
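The same cast can also live where the body is built, so that no caller can forget it. This is a rough sketch rather than the script's actual code: `build_erasure_body` is a hypothetical helper, the numeric device ID is invented for illustration, and the field names are taken from the question. It wraps the identity in a list, as in the earlier working version.

```python
import json
import uuid
from datetime import datetime, timezone


def build_erasure_body(device_id, identity_type, property_id):
    """Build the request body, casting device_id to str before encoding."""
    # UTC timestamp without fractional seconds, e.g. 2019-01-07T21:37:51Z
    submitted = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "subject_request_id": str(uuid.uuid4()),
        "subject_request_type": "erasure",
        "submitted_time": submitted,
        "subject_identities": [
            {
                "identity_type": identity_type,
                # str() also works on numpy integer scalars read from a DataFrame
                "identity_value": str(device_id),
                "identity_format": "raw",
            }
        ],
        "property_id": property_id,
    }


# Invented numeric ID, standing in for a value pandas parsed as an integer.
payload = json.dumps(
    build_erasure_body(1234567890, "android_advertising_id", "com.tacos.best")
)
print(payload)
```

Because every value in the returned dict is a plain str, list, or dict, json.dumps needs no custom encoder here.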