JSON object (output): [424783,[198184],[605],[644],[296],[2048],424694,[369192],[10139],[152532],[397538],[1420]]
<<<code removed>>>
Desired output:
424783,198184
424783,605
424783,644
424783,296
424783,2048
424694,369192
424694,10139
424694,152532
424694,397538
424694,1420
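Answer 0 (score: 1)
From your data, it looks like the non-bracketed items should be treated as the value for the first column (i.e. the key) and the bracketed items as the values for the second column, each paired with the key that precedes them. You can do this in a purely procedural way: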
import csv
import json
src = '''[424783, [198184], [605], [644], [296], [2048],
424694, [369192], [10139], [152532], [397538], [1420]]'''
with open('output.csv', 'w', newline='') as f:  # Python 2.x: open('output.csv', 'wb')
    writer = csv.writer(f)  # create a simple CSV writer
    current_key = None  # a container for the last seen / cached 'key'
    for element in json.loads(src):  # parse the structure and iterate over it
        if isinstance(element, list):  # if the element is a 'list'
            writer.writerow((current_key, element[0]))  # write to csv w/ cached key
        else:
            current_key = element  # cache the element as the key for following entries
Which should produce an output.csv containing:
424783,198184
424783,605
424783,644
424783,296
424783,2048
424694,369192
424694,10139
424694,152532
424694,397538
424694,1420
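The same key-caching loop can also be written as a generator, which keeps the pairing logic separate from the file handling. This is only a sketch: iter_pairs is a hypothetical helper name, not part of the answer above, and it assumes every bracketed item holds exactly one value, as in the sample data (shortened here).

```python
import json

def iter_pairs(elements):
    """Yield (key, value) pairs; hypothetical helper, not from the original answer."""
    current_key = None  # last bare (non-list) element seen so far
    for element in elements:
        if isinstance(element, list):
            # assumption: each bracketed item holds exactly one value
            yield current_key, element[0]
        else:
            current_key = element  # remember the key for the entries that follow

src = '[424783, [198184], [605], 424694, [369192]]'  # shortened sample input
pairs = list(iter_pairs(json.loads(src)))
print(pairs)
```

The pairs can then be handed to csv.writer(f).writerows(...) in place of the per-row writerow call.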
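Answer 1 (score: 0)
itertools.groupby is a bit of a challenge for Python beginners, but it is very handy for walking through a series of items and processing them in groups. In this case, we group by whether or not each item is a Python list.
From each group of nested ints, we create one or more entries in an accumulator list.
Once the accumulator list is loaded, the code below prints out the results, which is easily converted to writing to a file.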
import ast
from itertools import groupby
from collections import namedtuple
# this may be JSON, but it's also an ordinary Python nested list of ints, so safely parseable using
# ast.literal_eval()
text = "[424783, [198184], [605], [644], [296], [2048], 424694, [369192], [10139], [152532], [397538], [1420]]"
items = ast.literal_eval(text)
# a namedtuple to hold each record, and a list to accumulate them
DataRow = namedtuple("DataRow", "old_id new_id")
accumulator = []
# use groupby to process the entries in groups, depending on whether the items are lists or not
key = None
for is_data, values in groupby(items, key=lambda x: isinstance(x, list)):
    if not is_data:
        # the sole value is the next record key
        key = list(values)[0]
    else:
        # the values are the collection of lists until the next key
        accumulator.extend(DataRow(key, v[0]) for v in values)
# dump out as csv
for item in accumulator:
    print("{old_id},{new_id}".format_map(item._asdict()))
Prints:
424783,198184
424783,605
424783,644
424783,296
424783,2048
424694,369192
424694,10139
424694,152532
424694,397538
424694,1420
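To write the rows to a file instead of printing them, one option is to hand the accumulator to csv.writer, since each namedtuple is an ordinary tuple. A minimal sketch, using an in-memory io.StringIO buffer in place of a real file (an assumption made here so the example has no side effects); the short rows list stands in for the accumulator built above.

```python
import csv
import io
from collections import namedtuple

DataRow = namedtuple("DataRow", "old_id new_id")
# stand-in for the accumulator built by the groupby loop (shortened)
rows = [DataRow(424783, 198184), DataRow(424783, 605), DataRow(424694, 369192)]

# swap the buffer for open('output.csv', 'w', newline='') to write a real file
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(rows)  # namedtuples are tuples, so each one is written as one row
csv_text = buffer.getvalue()
print(csv_text, end="")
```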
Answer 2 (score: 0)
I think using itertools.groupby() is a good approach, because grouping the items is the main part of the work needed here.
Here is a fairly simple way to use it:
import csv
from itertools import groupby
import json
json_src = '''[424783, [198184], [605], [644], [296], [2048],
424694, [369192], [10139], [152532], [397538], [1420]]'''
def xyz():
    # parse the JSON text into a Python list
    return json.loads(json_src)
def abc():
    json_processed = xyz()
    output_filename = 'y.csv'
    with open(output_filename, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for is_list, items in groupby(json_processed, key=lambda v: isinstance(v, list)):
            if is_list:
                # a run of bracketed items: collect the value inside each one
                new_ids = [item[0] for item in items]
            else:
                # a bare item: remember it as the key, then move on to its values
                old_id = next(items)
                continue
            for new_id in new_ids:
                writer.writerow([old_id, new_id])
abc()
Contents of the resulting CSV file:
424783,198184
424783,605
424783,644
424783,296
424783,2048
424694,369192
424694,10139
424694,152532
424694,397538
424694,1420
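To see why the loop above works, it can help to look at what groupby yields for this data: consecutive items with the same key value are bundled into one group, so each bare id arrives on its own and the bracketed items after it arrive together. A small sketch on a shortened copy of the sample input:

```python
from itertools import groupby

data = [424783, [198184], [605], 424694, [369192]]  # shortened sample input

# materialize each group so it can be inspected; groupby itself yields lazy iterators
groups = [(is_list, list(items))
          for is_list, items in groupby(data, key=lambda v: isinstance(v, list))]

for is_list, items in groups:
    print(is_list, items)
```

Note that groupby only merges adjacent items, so the approach relies on each id appearing directly before its bracketed values, as in the input shown in the question.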