Question

I am trying to write JSON to a data.csv file. I tried to apply the solution from this Stack Overflow question: How do I write a Python dictionary to a csv file?

So I came up with this:

with open("data/dataGold.csv", 'w') as f:
    w = csv.DictWriter(f, ['data']['user']['repositories']['nodes'], extrasaction='ignore')
    w.writeheader()
    w.writerow(response)
    w.writerow([data['data']['user']['repositories']['nodes']['name'],
              data['data']['user']['repositories']['nodes']['forkCount'],
              data['data']['user']['repositories']['nodes']['issues']])

My response variable, which is of type 'dict', is:

{'data': {'user': {'name': 'Markus Goldstein',
                   'repositories': {'nodes': [{'forkCount': 0,
                                               'issues': {'totalCount': 0},
                                               'name': 'repache'},
                                              {'forkCount': 4,
                                               'issues': {'totalCount': 3},
                                               'name': 'nf-hishape'},
                                              {'forkCount': 4,
                                               'issues': {'totalCount': 7},
                                               'name': 'ip-countryside'},
                                              {'forkCount': 42,
                                               'issues': {'totalCount': 29},
                                               'name': 'bonesi'},
                                              {'forkCount': 13,
                                               'issues': {'totalCount': 3},
                                               'name': 'rapidminer-anomalydetection'},
                                              {'forkCount': 0,
                                               'issues': {'totalCount': 0},
                                               'name': 'rapidminer-studio'}]}}}}

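I get a TypeError saying that indexing is not allowed. I think this is because I used ['data']['user']['repositories']['nodes'].

The solution I linked above works because the dict/JSON there is not nested. I don't know how to handle a nested dict/JSON like the one in my case.

My goal is a CSV with name, forkCount and issues as the header. The following rows should hold the values for the different repos.

I hope someone can help me, and sorry for my bad English. Thanks!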
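
For reference, the TypeError can be reproduced in isolation. The sketch below assumes the cause is the expression `['data']['user']...` passed as `fieldnames`: Python reads `['data']` as a one-element list and then tries to index it with the string `'user'`. The variable name `path_start` is made up for illustration.

```python
# ['data'] on its own is a list literal, not a path into the response dict.
path_start = ['data']

try:
    # Equivalent to the start of ['data']['user'] in the question:
    # a list is indexed with a string, which is not allowed.
    path_start['user']
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

To reach the list of repositories, the indexing has to start from the dict itself, for example `response['data']['user']['repositories']['nodes']`.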

Answer 1

Markus = {'data': {'user': {'name': 'Markus Goldstein',
                   'repositories': {'nodes': [{'forkCount': 0,
                                               'issues': {'totalCount': 0},
                                               'name': 'repache'},
                                              {'forkCount': 4,
                                               'issues': {'totalCount': 3},
                                               'name': 'nf-hishape'},
                                              {'forkCount': 4,
                                               'issues': {'totalCount': 7},
                                               'name': 'ip-countryside'},
                                              {'forkCount': 42,
                                               'issues': {'totalCount': 29},
                                               'name': 'bonesi'},
                                              {'forkCount': 13,
                                               'issues': {'totalCount': 3},
                                               'name': 'rapidminer-anomalydetection'},
                                              {'forkCount': 0,
                                               'issues': {'totalCount': 0},
                                               'name': 'rapidminer-studio'}]}}}}

with open('Markus.csv', 'w') as markus:
    # Header row
    print('name,forkCount,issues', file=markus)
    # One CSV line per repository node
    for node in Markus['data']['user']['repositories']['nodes']:
        print('{},{},{}'.format(node['name'], node['forkCount'], node['issues']['totalCount']), file=markus)

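The first print statement writes the header row to the csv file.
The for loop unpacks the items from the dictionary.
The second print statement writes each unpacked item to the csv file.

This is the result.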

name,forkCount,issues
repache,0,0
nf-hishape,4,3
ip-countryside,4,7
bonesi,42,29
rapidminer-anomalydetection,13,3
rapidminer-studio,0,0
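
Formatting the rows by hand works for this data, but it would produce a broken line if a repository name ever contained a comma or a quote. As a minimal sketch of the same loop using `csv.writer`, which quotes such values automatically: the helper name `nodes_to_csv_text` and the second sample row (`'hypothetical,repo'`) are assumptions added for illustration, and the text is written to an in-memory buffer rather than to `Markus.csv`.

```python
import csv
import io


def nodes_to_csv_text(nodes):
    """Return CSV text with a header and one row per repository node."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer)
    writer.writerow(['name', 'forkCount', 'issues'])
    for node in nodes:
        # Pull the nested totalCount up so each row is flat.
        writer.writerow([node['name'], node['forkCount'], node['issues']['totalCount']])
    return buffer.getvalue()


sample_nodes = [
    {'forkCount': 42, 'issues': {'totalCount': 29}, 'name': 'bonesi'},
    # Hypothetical name containing a comma, to show the quoting.
    {'forkCount': 1, 'issues': {'totalCount': 2}, 'name': 'hypothetical,repo'},
]

csv_text = nodes_to_csv_text(sample_nodes)
print(csv_text)
```

To write to a file instead, the same `csv.writer` calls can be used on a file opened with `open('Markus.csv', 'w', newline='')`.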

Answer 2

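The following should work fine, given these points:

1] The extra loop I wrote changes your structure: it removes the dictionary under issues and stores the value of totalCount directly under issues, so that the CSV comes out clean.

2] I use deepcopy here because I don't want to modify the original data structure, so I work on a deep copy of it rather than on a reference.

3] wt_csv[0].keys() is cast to a list because in Python 3 .keys() returns dict_keys rather than a list.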
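
Points 2] and 3] can be checked on a small sample. This is only a sketch: the one-repository dict `original` is a shortened stand-in for the real response, and the variable names are made up for illustration.

```python
import copy

# Shortened stand-in for the nested response.
original = {'nodes': [{'name': 'bonesi', 'issues': {'totalCount': 29}}]}

# Work on a deep copy so the nested dicts in `original` stay untouched.
flattened = copy.deepcopy(original['nodes'])
for row in flattened:
    row['issues'] = row['issues']['totalCount']

# In Python 3, .keys() returns a dict_keys view; list() turns it into a list.
keys_view = flattened[0].keys()
fieldnames = list(keys_view)

print(type(keys_view).__name__, fieldnames)
print(original['nodes'][0]['issues'])  # still the nested dict
```

With a shallow copy such as `list(original['nodes'])`, the loop would have rewritten the `issues` entries of the original rows as well, because both lists would share the same inner dicts.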

import csv
import json
import copy

i_dict = {'data': {'user': {'name': 'Markus Goldstein',
                           'repositories': {'nodes': [{'forkCount': 0,
                                                       'issues': {'totalCount': 0},
                                                       'name': 'repache'},
                                                      {'forkCount': 4,
                                                       'issues': {'totalCount': 3},
                                                       'name': 'nf-hishape'},
                                                      {'forkCount': 4,
                                                       'issues': {'totalCount': 7},
                                                       'name': 'ip-countryside'},
                                                      {'forkCount': 42,
                                                       'issues': {'totalCount': 29},
                                                       'name': 'bonesi'},
                                                      {'forkCount': 13,
                                                       'issues': {'totalCount': 3},
                                                       'name': 'rapidminer-anomalydetection'},
                                                      {'forkCount': 0,
                                                       'issues': {'totalCount': 0},
                                                       'name': 'rapidminer-studio'}]}}}}


wt_csv = copy.deepcopy(i_dict['data']['user']['repositories']['nodes'])

for wc in wt_csv:
    wc['issues'] = wc['issues']['totalCount']

# newline='' lets the csv module control line endings (avoids blank lines on Windows)
with open('dataGold.csv', 'w', newline='') as output_file:
    dict_writer = csv.DictWriter(output_file, fieldnames=list(wt_csv[0].keys()))
    dict_writer.writeheader()
    dict_writer.writerows(wt_csv)
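
As a possible variation, the flat rows can be built with a list comprehension, which leaves the original data alone without needing a deep copy, and the field names can be listed explicitly so the columns come out in the order asked for in the question (name, forkCount, issues). This is a sketch under those assumptions; it uses a two-repository sample and an in-memory buffer instead of `dataGold.csv`.

```python
import csv
import io

# Two-repository sample taken from the response in the question.
nodes = [
    {'forkCount': 0, 'issues': {'totalCount': 0}, 'name': 'repache'},
    {'forkCount': 4, 'issues': {'totalCount': 3}, 'name': 'nf-hishape'},
]

# Explicit column order for the header and every row.
fieldnames = ['name', 'forkCount', 'issues']

# Build new flat dicts; the nested dicts in `nodes` are not modified.
rows = [
    {'name': n['name'], 'forkCount': n['forkCount'], 'issues': n['issues']['totalCount']}
    for n in nodes
]

buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)

result = buffer.getvalue()
print(result)
```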

If anything is unclear, let me know in the comments.

Answer 3

Since you are analysing the usage of RapidMiner, you could also choose to do this with RapidMiner's Text Processing:

Here is the XML:


<?xml version="1.0" encoding="UTF-8"?>
<process version="8.0.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="text:create_document" compatibility="7.5.000" expanded="true" height="68" name="Create Document" width="90" x="45" y="34">
        <parameter key="text" value="{&#10;  &quot;data&quot;: {&#10;    &quot;user&quot;: {&#10;      &quot;name&quot;: &quot;Markus Goldstein&quot;,&#10;      &quot;repositories&quot;: {&#10;        &quot;nodes&quot;: [&#10;          {&#10;            &quot;forkCount&quot;: 0,&#10;            &quot;issues&quot;: {&#10;              &quot;totalCount&quot;: 0&#10;            },&#10;            &quot;name&quot;: &quot;repache&quot;&#10;          },&#10;          {&#10;            &quot;forkCount&quot;: 4,&#10;            &quot;issues&quot;: {&#10;              &quot;totalCount&quot;: 3&#10;            },&#10;            &quot;name&quot;: &quot;nf-hishape&quot;&#10;          },&#10;          {&#10;            &quot;forkCount&quot;: 4,&#10;            &quot;issues&quot;: {&#10;              &quot;totalCount&quot;: 7&#10;            },&#10;            &quot;name&quot;: &quot;ip-countryside&quot;&#10;          },&#10;          {&#10;            &quot;forkCount&quot;: 42,&#10;            &quot;issues&quot;: {&#10;              &quot;totalCount&quot;: 29&#10;            },&#10;            &quot;name&quot;: &quot;bonesi&quot;&#10;          },&#10;          {&#10;            &quot;forkCount&quot;: 13,&#10;            &quot;issues&quot;: {&#10;              &quot;totalCount&quot;: 3&#10;            },&#10;            &quot;name&quot;: &quot;rapidminer-anomalydetection&quot;&#10;          },&#10;          {&#10;            &quot;forkCount&quot;: 0,&#10;            &quot;issues&quot;: {&#10;              &quot;totalCount&quot;: 0&#10;            },&#10;            &quot;name&quot;: &quot;rapidminer-studio&quot;&#10;          }&#10;        ]&#10;      }&#10;    }&#10;  }&#10;}"
        />
      </operator>
      <operator activated="true" class="text:json_to_data" compatibility="7.5.000" expanded="true" height="82" name="JSON To Data" width="90" x="179" y="34" />
      <connect from_op="Create Document" from_port="output" to_op="JSON To Data" to_port="documents 1" />
      <connect from_op="JSON To Data" from_port="example set" to_port="result 1" />
      <portSpacing port="source_input 1" spacing="0" />
      <portSpacing port="sink_result 1" spacing="0" />
      <portSpacing port="sink_result 2" spacing="0" />
    </process>
  </operator>
</process>


Python 3 - Loading JSON data into my .csv file
