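Let me preface this with "I'm a newbie working with dreaded 'databases'". Below is the structure of the JSON currently produced from my CSV (an overview of the CSV follows it). Basically, I want to attach the "group" from column A (Information Technology) to each "data" dictionary, so that it gains a "group" key with the matching value, i.e. "group": "Information Technology". Likewise, everything below row 5 (Consumer Discretionary) should get the key/value pair "group": "Consumer Discretionary".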
{
    "stocks": [
        {
            "data": {
                "portfolio_average_weight": "5.985",
                "portfolio_total_return": "27.948"
            },
            "name": "Google Inc "
        },
        {
            "data": {
                "portfolio_average_weight": "2.896",
                "portfolio_total_return": "24.292"
            },
            "name": "Mastercard Inc "
        }]
}
Column A                 Column B      Column C      Column D
Information Technology   [blank cell]  [blank cell]  [blank cell]
[blank cell]             Google        5.985         27.948
[blank cell]             Mastercard    2.896         24.292
Consumer Discretionary   [blank cell]  [blank cell]  [blank cell]
[blank cell]             xxxxxx        xxxxxxxxx     xxxxxxxxx
Here is my current code:
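import csv

def test_two_level(line):
    # Columns C and D hold the two portfolio figures
    data = {
        "portfolio_average_weight": line[2],
        "portfolio_total_return": line[3]}
    return data

stocks = []
with open('test.csv', newline='') as csvfile:
    lines = csv.reader(csvfile)
    for line in lines:
        # Stock rows have an empty column A and a name in column B
        if line[0] == "" and line[1] != "":
            data = test_two_level(line)
            bottom_level = {
                "name": line[1],
                "data": data}
            stocks.append(bottom_level)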
I would like the final output to look like this:
{
    "stocks": [
        {
            "data": {
                "portfolio_average_weight": "5.985",
                "portfolio_total_return": "27.948",
                "group": "Information Technology"
            },
            "name": "Google Inc "
        },
        {
            "data": {
                "portfolio_average_weight": "2.896",
                "portfolio_total_return": "24.292",
                "group": "Information Technology"
            },
            "name": "Mastercard Inc "
        }]
}
Here is the CSV:
Information Technology,,,
,Google Inc ,5.985,27.948
,Mastercard Inc ,2.896,24.292
Consumer Discretionary,,,
Answer 0 (score: 1)
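I tend to use csv.DictReader rather than csv.reader because the resulting code is easier to read, and having each row read into a dictionary also makes the code more uniform, especially when working with JSON objects, which are themselves usually made up of one or more dictionaries.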
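As a small illustration of that difference in isolation (separate from the full script that follows), the sketch below reads the same two rows both ways. It assumes Python 3, and the in-memory `SAMPLE` string and the `FIELDS` tuple are stand-ins chosen for this example rather than anything taken from the original file.

```python
import csv
import io

# Hypothetical in-memory sample mirroring the first rows of the question's CSV.
SAMPLE = "Information Technology,,,\n,Google Inc ,5.985,27.948\n"

# csv.reader yields lists, so fields are addressed by position.
rows_as_lists = list(csv.reader(io.StringIO(SAMPLE)))

# csv.DictReader yields dicts; passing fieldnames names the columns,
# which is needed here because the data has no header row.
FIELDS = ('group', 'name', 'average_weight', 'total_return')
rows_as_dicts = list(csv.DictReader(io.StringIO(SAMPLE), fieldnames=FIELDS))

print(rows_as_lists[1][2])                 # positional access
print(rows_as_dicts[1]['average_weight'])  # access by column name
```

Both lines print the same field; the second one simply says which field it is.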
import csv, json

# Python 3: open in text mode with newline='' as the csv module recommends
with open('csv_to_json_test.csv', newline='') as csvfile:
    csvfields = 'group', 'name', 'average_weight', 'total_return'
    reader = csv.DictReader(csvfile, fieldnames=csvfields)
    database = {}
    stocks = database['stocks'] = []  # initialize item to be parsed
    group = None
    for row in reader:
        if row['group']:
            group = row['group']  # group header row: remember it for the rows below
        else:
            stocks.append(
                {
                    'group': group,
                    'data': {
                        "portfolio_average_weight": row['average_weight'],
                        "portfolio_total_return": row['total_return']
                    },
                    'name': row['name'].rstrip(),  # strips trailing spaces
                }
            )

print('database =', json.dumps(database, indent=4))
Output:
database = {
    "stocks": [
        {
            "group": "Information Technology",
            "data": {
                "portfolio_average_weight": "5.985",
                "portfolio_total_return": "27.948"
            },
            "name": "Google Inc"
        },
        {
            "group": "Information Technology",
            "data": {
                "portfolio_average_weight": "2.896",
                "portfolio_total_return": "24.292"
            },
            "name": "Mastercard Inc"
        }
    ]
}
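Note that the output above carries "group" next to "data" and "name", whereas the question asks for it inside each "data" dictionary. A minimal sketch of that variation, assuming Python 3, might look like the following. The `build_database` helper, the `FIELDS` tuple, and the in-memory `SAMPLE_CSV` string are names made up for this example; the names are left unstripped here only because the desired output in the question keeps the trailing space.

```python
import csv
import io
import json

# Hypothetical in-memory copy of the CSV shown in the question.
SAMPLE_CSV = (
    "Information Technology,,,\n"
    ",Google Inc ,5.985,27.948\n"
    ",Mastercard Inc ,2.896,24.292\n"
    "Consumer Discretionary,,,\n"
)

FIELDS = ('group', 'name', 'average_weight', 'total_return')


def build_database(csvfile):
    """Return {'stocks': [...]} with the current group stored inside 'data'."""
    stocks = []
    group = None  # most recent non-empty value seen in column A
    for row in csv.DictReader(csvfile, fieldnames=FIELDS):
        if row['group']:
            group = row['group']  # group header row: remember it, emit nothing
        elif row['name']:
            stocks.append({
                'data': {
                    'portfolio_average_weight': row['average_weight'],
                    'portfolio_total_return': row['total_return'],
                    'group': group,
                },
                'name': row['name'],  # left unstripped, as in the desired output
            })
    return {'stocks': stocks}


database = build_database(io.StringIO(SAMPLE_CSV))
print(json.dumps(database, indent=4))
```

To run it against a real file, the `io.StringIO(...)` argument could be swapped for a file object opened with `open('test.csv', newline='')`.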