Question

I downloaded a CSV file from Google Trends that presents its data in this format:

Top cities for golden globes
City,golden globes
New York (United States),100
Los Angeles (United States),91
Toronto (Canada),69

Top regions for golden globes
Region,golden globes
United States,100
Canada,91
Ireland,72
Australia,72

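There are 3-4 of these groups, separated by blank lines. The first line of each group contains the text I want to use as a key, followed by the rows I need to associate with that key as a dictionary. Does anyone have suggestions for Python tools I could use? I haven't had much luck with Python's CSV library.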

The output I want from the CSV above looks like this:

{
"Top cities for golden globes" :
   {
      "New York (United States)" : 100,
      "Los Angeles (United States)" : 91,
      "Toronto (Canada)" : 69
   },
"Top regions for golden globes" :
   {
      "United States" : 100,
      "Canada" : 91,
      "Ireland" : 72,
      "Australia" : 72
   }
}

Answer 1

Your input format is so predictable that I would parse it by hand, without the CSV library.

import json
from collections import defaultdict

result = defaultdict(dict)  # maps each group title to its {name: value} dict
current_key = ""  # title of the group currently being read
ignore_next = False  # flag to skip the column-header row

with open("yourfile.csv") as fh:
    for line in fh:
        line = line.strip()  # throw away the newline
        if line == "":  # a blank line ends the current group
            current_key = ""
            continue
        if current_key == "":  # first line of a group is its title
            current_key = line
            ignore_next = True
            continue
        if ignore_next:  # column-header row, e.g. "City,golden globes"
            ignore_next = False
            continue
        # split on the last comma so names containing commas stay intact
        name, value = line.rsplit(",", 1)
        result[current_key][name] = int(value)  # numbers, as in the desired output

# pretty-print the data
print(json.dumps(result, sort_keys=True, indent=4))
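
The same idea can also be written as a function that takes the file contents as a string, which makes it easier to try out without a file on disk. This is only a sketch: the name parse_trends is made up for illustration, and it assumes every group is a title line, then one header row, then "name,value" rows with integer values, with groups separated by one or more blank lines.

```python
import json
import re


def parse_trends(text):
    """Parse blank-line-separated groups into {title: {name: value}}."""
    result = {}
    # Split into groups wherever one or more blank lines occur.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 2:
            continue  # need at least a title and a header row
        title, _header, *rows = lines
        # rsplit on the last comma keeps names that contain commas intact.
        result[title] = {
            name: int(value)
            for name, value in (row.rsplit(",", 1) for row in rows)
        }
    return result


sample = """Top cities for golden globes
City,golden globes
New York (United States),100
Los Angeles (United States),91
Toronto (Canada),69

Top regions for golden globes
Region,golden globes
United States,100
Canada,91
Ireland,72
Australia,72
"""

parsed = parse_trends(sample)
print(json.dumps(parsed, indent=4))
```

Reading the real file would then be something like parse_trends(open("yourfile.csv").read()).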

Answer 2

I'd try something like this:

import csv

row = []
dd = {}
with open('the.csv', newline='') as f:
    r = csv.reader(f)
    while True:
        if row:  # normal case, non-empty row
            d[row[0]] = int(row[1])  # store the value as a number
            row = next(r, None)
            if row is None: break
        else:  # row is empty at start and after blank line
            category = next(f, None)  # read the title line raw, not as CSV
            if category is None: break
            category = category.strip()
            next(r, None)  # skip headers row
            d = dd[category] = {}
            row = next(r, None)
            if row is None: break

Now dd should be the dictionary you want, and you can json.dump it however you like.
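
For example, the dump step might look like the sketch below. The dd here is a small stand-in with a couple of entries, not the full result, and io.StringIO stands in for a real output file so the snippet runs on its own.

```python
import io
import json

# Stand-in for the dd built by the loop above (assumed, shortened).
dd = {
    "Top cities for golden globes": {
        "New York (United States)": 100,
        "Toronto (Canada)": 69,
    },
    "Top regions for golden globes": {
        "United States": 100,
        "Canada": 91,
    },
}

# json.dumps returns the JSON as a string, handy for printing.
text = json.dumps(dd, indent=4, sort_keys=True)
print(text)

# json.dump writes to any file-like object; with a real file you would
# use something like: with open("out.json", "w") as out: json.dump(dd, out)
buffer = io.StringIO()
json.dump(dd, buffer, indent=4, sort_keys=True)
```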

How to convert a specific CSV format to JSON using Python
