Python CSV to JSON w/ array output

Time: 2017-01-05 08:07:01

Tags: python arrays json csv output

I am trying to take data from a CSV file and place it in a top-level array in JSON format.

Currently I am running this code:

import csv
import json

csvfile = open('music.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("ID","Artist","Song", "Artist")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')

The CSV file is formatted like this:

| 1 | Empire of the Sun | We Are The People | Walking on a Dream |
| 2 | M83 | Steve McQueen | Hurry Up We're Dreaming | 

Where: column 1 is the ID | column 2 is the Artist | column 3 is the Song | column 4 is the Album

I am getting this output:

    {"Song": "Empire of the Sun", "ID": "1", "Artist": "Walking on a   Dream"}
    {"Song": "M83", "ID": "2", "Artist": "Hurry Up We're Dreaming"}

I am trying to get it to look like this:

{             
    "Music": [

    {
        "id": 1,
        "Artist": "Empire of the Sun",
        "Name": "We are the People",
        "Album": "Walking on a Dream"
    },
    {
        "id": 2,
        "Artist": "M83",
        "Name": "Steve McQueen",
        "Album": "Hurry Up We're Dreaming"
    },
    ]
}

4 answers:

Answer 0: (score: 2)

Pandas solves this very simply. First, read the file:

import pandas

df = pandas.read_csv('music.csv', names=("id","Artist","Song", "Album"))

Now you have some choices. The quickest way to get a proper JSON file out of it is simply:

df.to_json('file.json', orient='records')

Output:

[{"id":1,"Artist":"Empire of the Sun","Song":"We Are The People","Album":"Walking on a Dream"},{"id":2,"Artist":"M83","Song":"Steve McQueen","Album":"Hurry Up We're Dreaming"}]

This does not meet all of your requirements (wrapping everything in a "Music" object, or the order of the fields), but it does have the benefit of brevity.

To wrap the output in a Music object, we can use to_dict:

import json
with open('file.json', 'w') as f:
    json.dump({'Music': df.to_dict(orient='records')}, f, indent=4)

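I suggest you reconsider insisting on a particular order for the fields, since the JSON specification explicitly states that "an object is an unordered set of name/value pairs" (emphasis mine).

Answer 1: (score: 1)

OK, this is untested, but try the following: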

import csv
import json
from collections import OrderedDict

fieldnames = ("ID","Artist","Song", "Album")

entries = []
# A with statement is preferable because it closes the file after use, even on errors.
with open('music.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        # Before Python 3.7 a plain dict did not guarantee key order;
        # an OrderedDict keeps keys in insertion order, and json.dump preserves it.
        entry = OrderedDict()
        for field in fieldnames:
            entry[field] = row[field]
        entries.append(entry)

output = {
    "Music": entries
}

with open('file.json', 'w') as jsonfile:
    json.dump(output, jsonfile)
    jsonfile.write('\n')
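
Since the answer above is marked as untested, here is a minimal self-contained sketch of the same idea that runs without any files. It assumes a genuinely comma-separated input, inlined here through io.StringIO. The helper name rows_to_music, the SAMPLE data, and the conversion of the ID to an integer are illustrative additions and not part of the original answer.

```python
import csv
import io
import json
from collections import OrderedDict

# Hypothetical inline sample; assumes real comma-separated data.
SAMPLE = (
    "1,Empire of the Sun,We Are The People,Walking on a Dream\n"
    "2,M83,Steve McQueen,Hurry Up We're Dreaming\n"
)

FIELDNAMES = ("ID", "Artist", "Song", "Album")


def rows_to_music(csvfile, fieldnames=FIELDNAMES):
    """Return {"Music": [...]} with each entry's keys in fieldnames order."""
    entries = []
    for row in csv.DictReader(csvfile, fieldnames):
        # Copy the fields in the desired order into an OrderedDict.
        entry = OrderedDict((field, row[field]) for field in fieldnames)
        # Assumption: the "ID" column always holds an integer.
        entry["ID"] = int(entry["ID"])
        entries.append(entry)
    return {"Music": entries}


result = rows_to_music(io.StringIO(SAMPLE))
text = json.dumps(result, indent=4)
print(text)
```

The whole structure is built first and serialized once, so the output is a single JSON document with the entries inside the "Music" array.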

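Answer 2: (score: 0)

Your logic is in the wrong order. json.dump (like json.dumps) is designed to convert a single object to JSON, recursively. So you should always think in terms of building up a single object before calling it.

First collect the rows into an array:

music = [r for r in reader]

Then put it into a dict:

result = {'Music': music}

Then dump to JSON:

json.dump(result, jsonfile)

Or all in one line:

json.dump({'Music': [r for r in reader]}, jsonfile)

"Ordered" JSON

If you really care about the order of object properties in the JSON (even though you shouldn't), you should not use the DictReader. Instead, use the regular reader and create the OrderedDict objects yourself:

from collections import OrderedDict

...

reader = csv.reader(csvfile)
music = [OrderedDict(zip(fieldnames, r)) for r in reader]

Or again in one line:

json.dump({'Music': [OrderedDict(zip(fieldnames, r)) for r in reader]}, jsonfile)

Other

Also, use a context manager (a with statement) for your files to ensure they are closed properly.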
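
The context-manager advice can be sketched as follows. This is only an illustration under stated assumptions: the input is truly comma-separated, and the function name convert and its parameters are hypothetical and not taken from the answer.

```python
import csv
import json
from collections import OrderedDict

FIELDNAMES = ("ID", "Artist", "Song", "Album")


def convert(csv_path, json_path, fieldnames=FIELDNAMES):
    """Read a comma-separated file and write {"Music": [...]} as JSON."""
    # Both files are closed automatically when the with block exits,
    # even if an exception is raised inside it.
    with open(csv_path, "r", newline="") as csvfile, \
            open(json_path, "w") as jsonfile:
        reader = csv.reader(csvfile)
        music = [OrderedDict(zip(fieldnames, r)) for r in reader]
        # A single dump call produces one JSON document.
        json.dump({"Music": music}, jsonfile, indent=4)
```

With the file names from the question, the call would be convert('music.csv', 'file.json').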

Answer 3: (score: 0)

  

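"It's not writing to the JSON file in the order I would like"

The csv.DictReader class returns Python dict objects. Python dictionaries are unordered collections. You have no control over their presentation order.

Python does provide an OrderedDict, which you can use if you avoid using csv.DictReader().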

  

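"It skips the song name altogether."

This is because the file is not actually a CSV file. In particular, each line begins and ends with the field separator. We can use .strip("|") to fix this.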
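
To make the effect of the framing pipes concrete, here is a small sketch that parses one sample line from the question with and without stripping. The variable names are illustrative only.

```python
import csv

# Sample line in the pipe-framed layout shown in the question.
line = "| 1 | Empire of the Sun | We Are The People | Walking on a Dream |\n"

# Without stripping, the leading and trailing "|" produce an empty
# first field and an empty last field.
raw = next(csv.reader([line], delimiter="|"))

# Strip surrounding whitespace, then the framing pipes, before parsing,
# and trim the padding around each individual field afterwards.
cleaned_line = line.strip().strip("|")
fields = [f.strip() for f in next(csv.reader([cleaned_line], delimiter="|"))]

print(raw)
print(fields)
```

The extra empty fields in the unstripped version shift every value by one position relative to the field names, which is why the columns end up under the wrong keys.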

  

"I need to output all of this data into an array named 'Music'"

Then the program needs to create a dictionary with "Music" as a key.

  

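"I need it to have commas after each artist's info. In the output I get"

This problem occurs because you call json.dumps() multiple times. If you want a valid JSON file, you should call it only once.

Try this: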
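
The difference between serializing once and serializing per row can be checked with a short sketch. The rows list below is made-up sample data, used only for illustration.

```python
import json

# Hypothetical sample rows standing in for the parsed CSV.
rows = [
    {"ID": "1", "Artist": "Empire of the Sun"},
    {"ID": "2", "Artist": "M83"},
]

# One dumps() call per row yields one JSON document per line,
# with nothing separating or enclosing them.
per_row = "".join(json.dumps(row) + "\n" for row in rows)

try:
    json.loads(per_row)
    per_row_is_valid = True
except json.JSONDecodeError:
    # The parser finds extra data after the first object.
    per_row_is_valid = False

# A single dumps() call on one enclosing object yields one document.
single = json.dumps({"Music": rows})
round_trip = json.loads(single)

print(per_row_is_valid)
print(round_trip == {"Music": rows})
```

The per-row text cannot be parsed as a single JSON document, while the single call round-trips cleanly and places the commas between entries automatically.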

import csv
import json
from collections import OrderedDict


def MyDictReader(fp, fieldnames):
    fp = (x.strip().strip('|').strip() for x in fp)
    reader = csv.reader(fp, delimiter="|")
    reader = ([field.strip() for field in row] for row in reader)
    dict_reader = (OrderedDict(zip(fieldnames, row)) for row in reader)
    return dict_reader

csvfile = open('music.csv', 'r')
jsonfile = open('file.json', 'w')
fieldnames = ("ID","Artist","Song", "Album")
reader = MyDictReader(csvfile, fieldnames)
json.dump({"Music": list(reader)}, jsonfile, indent=2)