Question

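I am trying to adapt a JSON-to-CSV parser I found on GitHub (https://github.com/vinay20045/json-to-csv). The code is set up to run from the terminal and takes 3 arguments: the node, the path to the JSON file, and the path of the CSV file to create.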

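I am trying to modify the code so that I can call it from another Python script I am writing. From what I understand, modules that run from the terminal use if __name__ == "__main__":, but if I want to run the code from another Python script, I need to create a function such as def main() to call, right?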

import sys
import json
import csv

# https://github.com/vinay20045/json-to-csv
##
# Convert to string keeping encoding in mind...
##


def to_string(s):
    try:
        return str(s)
    except:
        # Change the encoding type if needed
        return s.encode('utf-8')

def reduce_item(key, value):
    global reduced_item

    # Reduction Condition 1
    if type(value) is list:
        i = 0
        for sub_item in value:
            reduce_item(key + '_' + to_string(i), sub_item)
            i = i + 1

    # Reduction Condition 2
    elif type(value) is dict:
        sub_keys = value.keys()
        for sub_key in sub_keys:
            reduce_item(key + '_' + to_string(sub_key), value[sub_key])

    # Base Condition
    else:
        reduced_item[to_string(key)] = to_string(value)

# the function I created; I moved the contents of the __main__ block here
def main(node, json_file_path, csv_file_path):
    # Reading arguments
    # node = sys.argv[1]
    # json_file_path = sys.argv[2]
    # csv_file_path = sys.argv[3]

    fp = open(json_file_path, 'r')
    json_value = fp.read()
    raw_data = json.loads(json_value)
    print(raw_data['tag'])

    try:
        data_to_be_processed = raw_data[node]
    except:
        data_to_be_processed = raw_data

    processed_data = []
    header = []
    for item in data_to_be_processed:
        reduced_item = {}
        reduce_item(node, item)

        header += reduced_item.keys()

        processed_data.append(reduced_item)

    header = list(set(header))
    header.sort()

    with open(csv_file_path, 'a') as f:
        writer = csv.DictWriter(f, header, quoting=csv.QUOTE_ALL)
        writer.writeheader()
        for row in processed_data:
            writer.writerow(row)

    print ("Just completed writing csv file with %d columns" % len(header))


# if __name__ == "__main__":
#     if len(sys.argv) != 4:
#         print ("\nUsage: python json_to_csv.py <node_name> <json_in_file_path> <csv_out_file_path>\n")
#     else:
#         # Reading arguments
#     main(sys.argv)

This is the other Python script I use to call jsontocsv2.py:

import jsontocsv2
import json

filename = 'test2.csv'

SourceFile = 'carapi.json'

jsontocsv2.main('cars', SourceFile, filename)

Here is the error I get:

Traceback (most recent call last):
  File "/Users/Documents/Projects/test.py", line 8, in <module>
    jsontocsv2.main('cars', SourceFile, filename)
  File "/Users/Documents/Projects/jsontocsv2.py", line 84, in main
    reduce_item(node, item)
  File "/Users/Documents/Projects/jsontocsv2.py", line 57, in reduce_item
    reduce_item(key + '_' + to_string(sub_key), value[sub_key])
  File "/Users/Documents/Projects/jsontocsv2.py", line 61, in reduce_item
    reduced_item[to_string(key)] = to_string(value)
NameError: name 'reduced_item' is not defined

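Can anyone point me in the right direction on how to fix this? I have searched Stack Overflow a lot and found posts about similar problems, but I still have not figured out how to make it work.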

Answer 1

I was able to get the code to run the way I wanted.

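All I had to do was move the global reduced_item statement from the reduce_item() function into the main(node, json_file_path, csv_file_path) function I created. I am not sure why this works, given that the global variable it declares is never defined at module level.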
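A likely explanation, sketched with hypothetical names (store, record, main_local and main_global are not part of the original script): a global statement does not create a variable, it only tells Python which scope a name belongs to. In the original script the assignment reduced_item = {} ran at module level under the __main__ guard, so it created a module-level name. Once that assignment moved into main(), it created a local variable instead, unless main() itself declares the name global.

```python
def record(key, value):
    # `global` is optional here: the function only looks the name up
    # (item assignment), it never rebinds `store` itself.
    global store
    store[key] = value


def main_local():
    store = {}  # local to main_local; no module-level `store` is created
    record("a", 1)


def main_global():
    global store
    store = {}  # binds the module-level name that record() looks up
    record("a", 1)
    return store


local_error = None
try:
    main_local()
except NameError as exc:
    local_error = str(exc)

print("main_local failed:", local_error)
print("main_global returned:", main_global())
```

The first call fails with the same kind of NameError as in the traceback, while the second succeeds because the assignment inside main_global() targets the module namespace.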

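Why is defining something as global usually not a good idea? If you have suggestions for a better way to do this, I would appreciate the guidance. Thanks for your help.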
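One common concern with globals is that they are hidden shared state: a function's result then depends on what ran before it, which makes the code harder to reuse and test. A possible alternative, offered only as a sketch, is to pass the dictionary being filled as a parameter and return it. The out parameter, the flatten_records helper and the sample data below are assumptions made for illustration, not part of the original json-to-csv script.

```python
def reduce_item(key, value, out=None):
    """Flatten nested lists and dicts into a single-level dict of strings."""
    if out is None:
        out = {}  # fresh accumulator for each top-level call
    if isinstance(value, list):
        for i, sub_item in enumerate(value):
            reduce_item(f"{key}_{i}", sub_item, out)
    elif isinstance(value, dict):
        for sub_key, sub_value in value.items():
            reduce_item(f"{key}_{sub_key}", sub_value, out)
    else:
        out[str(key)] = str(value)
    return out


def flatten_records(node, records):
    """Return (sorted header, flattened rows) for a list of records."""
    rows = [reduce_item(node, item) for item in records]
    header = sorted({name for row in rows for name in row})
    return header, rows


if __name__ == "__main__":
    # Hypothetical sample data, used only to show the call.
    sample = [{"make": "Audi", "tags": ["a", "b"]}, {"make": "BMW"}]
    header, rows = flatten_records("cars", sample)
    print(header)
    print(rows)
```

Because nothing here relies on module-level state, the functions behave the same whether the file is run from the terminal or imported from another script. The __main__ guard keeps the demo from running on import.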
