Converting JSON to CSV

Date: 2017-05-25 22:02:06

Tags: python

import json
import csv
from watson_developer_cloud import NaturalLanguageUnderstandingV1
import watson_developer_cloud.natural_language_understanding.features.v1 as \
    features


natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2017-02-27',
    username='b6dd1781-02e4-4dca-a706-05597d574221',
    password='c3ked6Ttmmc1')

response = natural_language_understanding.analyze(
    text='Bruce Banner is the Hulk and Bruce Wayne is BATMAN! '
         'Superman fears not Banner, but Wayne.',
    features=[features.Entities()])

response1 = natural_language_understanding.analyze(
    text='Bruce Banner is the Hulk and Bruce Wayne is BATMAN! '
         'Superman fears not Banner, but Wayne.',
    features=[features.Keywords()])

#print response.items()[0][1][1]
make= json.dumps(response, indent=2)
make1= json.dumps(response1, indent=2)
print make
print make1

x = json.loads(make)

f = csv.writer(open("Entities.csv", "wb+"))


f.writerow(["relevance", "text", "type", "count"])

for x1 in x:
    f.writerow([x1['relevance'],
                x1['text'],
                x1['type'],
                x1['count']])

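The make variable above holds JSON that has to be converted to CSV. When I try to do this, I get a TypeError: string indices must be integers. The actual problem is that I cannot get past "entities" and into the key-value pairs. Can anyone tell me what I can do here?

Structure of the JSON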

{
  "entities": [
    {
      "relevance": 0.931351,
      "text": "Bruce Banner",
      "type": "Person",
      "count": 3
    },
    {
      "relevance": 0.288696,
      "text": "Wayne",
      "type": "Person",
      "count": 1
    }
  ],
  "language": "en"
}

2 answers:

Answer 0 (score: 0)

If you dump the JSON structure and data to a file, you can use this script to turn the key:value pairs into a CSV file:

# -*- coding: utf-8 -*-
"""
Created on Fri May 26 01:24:44 2017

@author: ITZIK CHAIMOV
"""
import csv


labels = []     #prepare empty list of labels and values
values = []

fin = open('dataFile.json', 'r')    #assuming you have dumped the data into a json file (as you showed at the example)
#numberOfLines = fin.readlines()
#for line in range(numberOfLines):
buffer = fin.readline()
buffer = fin.readline()
while (buffer!=''):
    while '"en"' not in buffer:
        if '{' in buffer:
            buffer = fin.readline()
            while '}' not in buffer:
                labels.append(buffer.split(':')[0].strip())
                values.append(buffer.split(':')[1].strip())
                buffer = fin.readline()
        buffer=fin.readline()
    break
fin.close()
n = len(labels)
firstLabel = labels[0]
i = 0
for lbl in labels:      # count the labels until the first one repeats
    if firstLabel == lbl and i != 0:
        break
    i += 1

tbl = []
tbl.append(labels[0:i])
for j in range(int(n/i)):
    tbl.append(values[j*i:(j+1)*i])


fout = open('testfile.csv', 'w')
csv_write = csv.writer(fout)
csv_write.writerows(tbl)
fout.close()

The CSV file as shown in Excel; the ' and " characters can be removed.

Answer 1 (score: -1)
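As an alternative to scanning the file line by line, the file can be parsed with the standard json module, which avoids the leftover quotes and commas. The following is only a sketch for Python 3, assuming the file has the structure shown in the question; the function name json_file_to_csv and the list_key parameter are made up for illustration, and the demo writes its files to a temporary directory.

```python
import csv
import json
import os
import tempfile


def json_file_to_csv(json_path, csv_path, list_key="entities"):
    """Write the records stored under list_key in a JSON file as CSV rows."""
    with open(json_path, "r") as fin:
        records = json.load(fin)[list_key]

    # Take the column labels from the first record.
    labels = list(records[0].keys())

    # newline="" lets the csv module control line endings (Python 3).
    with open(csv_path, "w", newline="") as fout:
        writer = csv.writer(fout)
        writer.writerow(labels)
        for record in records:
            writer.writerow([record[label] for label in labels])
    return labels


# Demo with the sample data from the question.
sample = {
    "entities": [
        {"relevance": 0.931351, "text": "Bruce Banner", "type": "Person", "count": 3},
        {"relevance": 0.288696, "text": "Wayne", "type": "Person", "count": 1},
    ],
    "language": "en",
}

workdir = tempfile.mkdtemp()
json_path = os.path.join(workdir, "dataFile.json")
csv_path = os.path.join(workdir, "testfile.csv")

with open(json_path, "w") as f:
    json.dump(sample, f, indent=2)

labels = json_file_to_csv(json_path, csv_path)

with open(csv_path, newline="") as f:
    rows = list(csv.reader(f))

print(rows)
```

Because the values go through the JSON parser, strings arrive without their surrounding quotes and numbers without trailing commas.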

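x1 takes on the keys of the structure x. To access the value associated with each key you need x[x1]; otherwise you are looking up an index named 'relevance' in x1, which is itself a key of type string.

x contains the whole JSON structure. You are only interested in the list of dictionaries indexed by the "entities" key. So you first access just that list, and then go through the key-value pairs of each entry.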
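A small sketch of why the error appears, written for Python 3 and using a shortened copy of the question's JSON as sample data. Iterating over a dictionary yields its keys, so each loop variable is a string, and indexing a string with 'relevance' raises the TypeError.

```python
import json

# Shortened sample based on the JSON structure in the question.
raw = ('{"entities": [{"relevance": 0.931351, "text": "Bruce Banner", '
       '"type": "Person", "count": 3}], "language": "en"}')
x = json.loads(raw)

# Iterating over a dict yields its keys, which are strings here.
keys = list(x)

# Indexing one of those strings with another string raises TypeError.
try:
    keys[0]['relevance']
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(keys)
print(error_message)
```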

for x1 in x['entities']:
    f.writerow([x1['relevance'],
                x1['text'],
                x1['type'],
                x1['count']])

The second key is "language", which returns the single string 'en', not a dictionary.
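Putting this together, here is a minimal sketch for Python 3 that writes every entry of the "entities" list as one CSV row. It uses csv.DictWriter from the standard library; the helper name entities_to_csv is hypothetical, and the output goes to an in-memory buffer so the example runs without touching the disk.

```python
import csv
import io
import json

# Sample data copied from the JSON structure in the question.
raw = '''
{
  "entities": [
    {"relevance": 0.931351, "text": "Bruce Banner", "type": "Person", "count": 3},
    {"relevance": 0.288696, "text": "Wayne", "type": "Person", "count": 1}
  ],
  "language": "en"
}
'''
data = json.loads(raw)

fieldnames = ["relevance", "text", "type", "count"]


def entities_to_csv(entities, out):
    """Write a header row and one row per entity dictionary to out."""
    # extrasaction="ignore" skips any keys that are not listed in fieldnames.
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for entity in entities:
        writer.writerow(entity)


buf = io.StringIO()
entities_to_csv(data["entities"], buf)
csv_text = buf.getvalue()
print(csv_text)
```

To write a real file in Python 3, pass a handle opened with open("Entities.csv", "w", newline="") in place of the buffer.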