Catching a pandas df.to_json() exception

Time: 2018-03-28 23:57:30

Tags: python json pandas dataframe exception

I am trying to determine why the pandas DataFrame.to_json() method fails when I run it. The DataFrame is valid, but it is very large (about 1,000,000 records).

Here is my code, where predictions is my DataFrame:

import sys

try:
    # predictions is the DataFrame; write_file is the output path
    predictions.to_json(write_file, orient='records', lines=True)
except EOFError as eoferr:
    print(eoferr)
    sys.exit('\nUnable to write file (%s)! EOFError. Exiting...' % write_file)
except IOError as ioerr:
    print(ioerr)
    sys.exit('\nUnable to write file (%s)! Permissions problem. Exiting...' % write_file)
except Exception as e:
    # Catch-all: anything that is not an EOFError or IOError ends up here
    print(e)
    sys.exit('\nUnable to write file (%s)! Unknown exception. Exiting...' % write_file)

Right now, the code ends up in the "Unknown exception. Exiting..." branch. Thanks in advance!

1 answer:

Answer 0: (score: 1)

Not a solution, but a workaround

I believe this is a memory problem with DataFrame.to_json(): the machine hangs while trying to convert a DataFrame of ~5M records x 1000 columns to a JSON file, and then exits by raising a generic Exception.
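
One way to check the memory hypothesis is to report the exception's type and traceback rather than only its message, which can be empty. The following is a minimal sketch; run_and_report is a hypothetical helper name, not part of pandas, and it assumes the failure actually surfaces as a Python exception rather than the process being killed by the operating system.

```python
import traceback


def run_and_report(action):
    """Run action(); return None on success, else a text report of the failure."""
    try:
        action()
    except MemoryError:
        # Raised by Python when an allocation fails.
        return "MemoryError: the process ran out of memory\n" + traceback.format_exc()
    except Exception as e:
        # type(e).__name__ identifies the exception even when str(e) is empty.
        return "%s: %s\n%s" % (type(e).__name__, e, traceback.format_exc())
    return None
```

With the question's variables, it could be called as run_and_report(lambda: predictions.to_json(write_file, orient='records', lines=True)) and the returned report printed before exiting.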

The workaround is to call DataFrame.to_dict() and then json.dump().
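
The answer's original code was not preserved, so the following is only a sketch of what the workaround might look like. The function name write_records, its parameters, and the to_native fallback are assumptions made for illustration. The lines=True branch writes one JSON object per line to mirror the question's to_json(..., lines=True) call. Note that the json module writes missing values as NaN, whereas to_json() writes null.

```python
import json

import pandas as pd


def to_native(obj):
    """Fallback for values json cannot encode (e.g. NumPy scalars, Timestamps)."""
    return obj.item() if hasattr(obj, "item") else str(obj)


def write_records(df, path, lines=True):
    """Write df to path via to_dict() and the json module instead of to_json()."""
    records = df.to_dict(orient="records")  # list of {column: value} dicts
    with open(path, "w") as f:
        if lines:
            # One JSON object per line, like to_json(orient='records', lines=True).
            for record in records:
                f.write(json.dumps(record, default=to_native) + "\n")
        else:
            # A single JSON array holding all records.
            json.dump(records, f, default=to_native)
    return len(records)
```

With the question's variables, this would be called as write_records(predictions, write_file).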

This works for a DataFrame of any size (so far). In the meantime, I will file a bug/issue in the pandas repo.