Python pandas to_json() produces invalid JSON

Asked: 2014-09-12 14:26:58

Tags: python json python-3.x pandas

I am having a problem with my JSON string output. I am working with a tab-separated CSV file that looks like this:

date        time        loc_id  country name    sub1_id sub2_id type
2014-09-11  00:00:01    179     US      acmnj   269     382     ico 
2014-09-11  00:00:01    179     US      acmnj   269     382     ico 
2014-09-11  00:00:01    179     GB      acmnj   269     382     ico 
2014-09-11  00:00:01    179     US      acmnj   269     382     ico 
2014-09-11  00:00:02    179     GB      acmnj   269     383     ico 
2014-09-11  00:00:02    179     JP      acmnj   269     383     ico 

The code looks like this:

import pandas as pd

df = pd.read_csv('log.csv', sep='\t', encoding='utf-16')
count = df.groupby(['country', 'name', 'sub1_id', 'sub2_id', 'type']).size()
# Note: Series.order() is the pre-0.17 pandas API; it was later renamed sort_values()
print(count.order(na_position='last', ascending=False).to_frame().to_json(orient='index'))

The output looks like this (first few lines):

{"["US","acmnj",269,383,"ico"]":{"0":76174},"["US","acmnj",269,382,"ico"]":{"0":73609},"["IT","acmnj",269,383,"ico"]":{"0":54211},"["IT","acmnj",269,382,"ico"]":{"0":52398},"["GB","acmnj",269,383,"ico"]":{"0":41346},"["GB","acmnj",269,382,"ico"]":{"0":40140},"["US","acmnj",269,405,"ico"]":{"0":39482},"["US","acmnj",269,400,"ico"]":{"0":39303},"["US","popcdd",178,365,"ico"]":{"0":33168},"["IT","acmnj",269,400,"ico"]":{"0":33026},"["IT","acmnj",269,405,"ico"]":{"0":32824},"["IT","achrfb141",141,42,"ico"]":{"0":26986},"["GB","acmnj",269,405,"ico"]":{"0":25895},"["IN","acmnj",269,383,"ico"]":{"0":25647},"["GB","acmnj",269,400,"ico"]":{"0":25488...

I want to load this output in PHP, but when I try to decode it I get NULL. I checked the string with a JSON validator and it is indeed invalid. I also tried calling to_json() without the orient parameter, but I still got invalid JSON.

1 Answer:

Answer 0 (score: 4)

This appears to be an issue with pandas; I reproduced your error.

DataFrame.to_json accepts several different orient arguments: 'split', 'records', 'index', 'columns', and 'values'.

In your case, 'split', 'records', and 'values' appear to work, but 'index' and 'columns' do not: your groupby over five columns produces a MultiIndex, and with orient='index' pandas stringifies each index tuple into a key like "["US","acmnj",269,383,"ico"]" without escaping the inner quotes, which is not valid JSON.

You can test this quickly in Python using the json module:

import json
import pandas as pd

df = pd.read_csv('log.csv', sep='\t', encoding='utf-16')
count = df.groupby(['country', 'name', 'sub1_id', 'sub2_id', 'type']).size()
f = count.order(ascending=False).to_frame()  # order() became sort_values() in pandas 0.17
json.loads(f.to_json(orient='index'))    # this failed for me
json.loads(f.to_json(orient='records'))  # this worked
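If the grouping keys themselves are needed in the JSON (orient='records' alone drops the MultiIndex), one practical workaround is to move them into ordinary columns with reset_index() first. A sketch with hypothetical data (the column name "count" is chosen here, not taken from the question):

```python
import json
import pandas as pd

# Hypothetical data standing in for the asker's log.
df = pd.DataFrame({"country": ["US", "US", "GB"],
                   "type": ["ico", "ico", "ico"]})

# reset_index() turns the MultiIndex levels back into regular columns,
# so orient='records' produces self-describing, valid JSON objects.
counts = (
    df.groupby(["country", "type"])
      .size()
      .to_frame("count")
      .reset_index()
)
records = json.loads(counts.to_json(orient="records"))
# Each row becomes a plain object, e.g. {"country": "US", "type": "ico", "count": 2}
```

The resulting list of objects is also straightforward to consume from PHP with json_decode().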