Read a CSV with duplicate headers and split it into a dict

Time: 2016-05-17 11:03:17

Tags: python

I am getting a CSV in an API response, and the CSV has this format:

"id","loc_name","qty","loc_name","qty"
"NM001","HLR","10","KBD","20"
"NM003","KMG","15","SLD","25"

I want a dictionary in this format: {"NM001": {"HLR": "10", "KBD": "20"}, "NM003": {"KMG": "15", "SLD": "25"}}

The code I tried:

import csv
from itertools import islice

field_names = next(csv.reader(csv_file, delimiter=","))
csv_file_handler = csv.DictReader(csv_file, delimiter=",", fieldnames=field_names)
for each_row in islice(csv_file_handler, 1, None):
    print(each_row)

  • Here csv_file is the file I received in the response.

Result:

{'id': 'NM001', 'loc_name': 'KBD', 'qty': '20'}
{'id': 'NM003', 'loc_name': 'SLD', 'qty': '25'}

The problem with csv.DictReader is that it returns only the last value for each repeated column, because the headers are identical.
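A minimal sketch of why this happens, using the header and first data row from the sample above. Each row becomes a dict keyed by field name, so a repeated name is overwritten by the later column. The hand-built `dict(zip(...))` is only an illustration of that collision, not DictReader's actual source, and `io.StringIO` stands in for the API response.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for the API response.
sample = (
    '"id","loc_name","qty","loc_name","qty"\n'
    '"NM001","HLR","10","KBD","20"\n'
)

# DictReader takes the field names from the first row by default.
rows = list(csv.DictReader(io.StringIO(sample)))

# Building a dict by hand from the same header shows the same collision:
# each later "loc_name"/"qty" value replaces the earlier one.
header = ["id", "loc_name", "qty", "loc_name", "qty"]
values = ["NM001", "HLR", "10", "KBD", "20"]
collapsed = dict(zip(header, values))

print(dict(rows[0]))
print(collapsed)
```

Both prints show only the KBD/20 pair; the HLR/10 pair is lost before the row ever reaches the loop.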

3 answers:

Answer 0: (score: 1)

Can't you do it like this:

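result = dict()
# "test.csv" is a placeholder name for the local file shown below
with open("test.csv") as file:
    for line in file:
        # drop the newline and the outer quotes, then split on the "," separators
        fields = line.strip().strip('"').split('","')
        # skip blank or malformed lines and the header row
        if len(fields) != 5 or fields[0] == "id":
            continue
        row_id, l_n_1, qty_1, l_n_2, qty_2 = fields
        result[row_id] = {l_n_1: qty_1, l_n_2: qty_2}

print(result)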

This works and handles it like you want to.

I opened a local file, but it should be possible from an API request as well. My file looked like this:

"id","loc_name","qty","loc_name","qty"
"NM001","HLR","10","KBD","20"
"NM003","KMG","15","SLD","25"
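The same idea can also be sketched with the standard csv module, which takes care of the quoting and pairs up any number of loc_name/qty columns. The function name `rows_to_nested_dict` and the `io.StringIO` stand-in for the API response are assumptions made for illustration.

```python
import csv
import io


def rows_to_nested_dict(csv_file):
    """Map each id to a {loc_name: qty} dict built from the remaining columns."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    result = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        row_id, rest = row[0], row[1:]
        # rest alternates name, qty, name, qty, ...; pair them up
        result[row_id] = dict(zip(rest[0::2], rest[1::2]))
    return result


# Sample data copied from the file above; io.StringIO stands in for the response.
sample = (
    '"id","loc_name","qty","loc_name","qty"\n'
    '"NM001","HLR","10","KBD","20"\n'
    '"NM003","KMG","15","SLD","25"\n'
)

result = rows_to_nested_dict(io.StringIO(sample))
print(result)
```

Because csv.reader yields plain lists, the duplicate header names never become dict keys, so nothing is overwritten.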

Answer 1: (score: 1)

I use the pandas module for handling CSV files. I would probably tackle this problem as follows (I assume it is neither the best way possible nor the best way in pandas, but it works).

import pandas as pd
# read the csv as DataFrame, probably there is a way to get it from
# the API directly without saving to a file
# specify header as the first row
df = pd.read_csv("test.csv", header=0)
# empty dict
d = {}
# iterate over lines, I use this way, but I don't like it in fact
for i, key_id in enumerate(df["id"]):
    # assign the values the way you want it
    # however you need to specify it by names
    # or indices
    d[key_id.strip()] = {df.loc[i, "loc_name"]:df.loc[i, "qty"],
                         df.loc[i, "loc_name.1"]:df.loc[i, "qty.1"]}
print(d)
#{'"NM003"': {'SLD': 25, 'KMG': 15}, '"NM001"': {'HLR': 10, 'KBD': 20}}

If you want this to work with additional columns (which must come in key:value pairs), you can use df.iloc[<row>, <col>] (the df.ix indexer offered by older pandas versions has since been removed) and iterate first over the rows (as before) and then over the columns, with an if so that only non-NaN values are added:

for i, key_id in enumerate(df["id"]):
    # create empty dict
    d[key_id.strip()] = {}
    # a less Pythonic, more C-like syntax
    # go through cols, skip the first and step is 2
    for j in range(1, len(df.columns), 2):
        # if there is some entry
        if not pd.isnull(df.iloc[i, j]):
            d[key_id.strip()][df.iloc[i, j]] = df.iloc[i, j + 1]

Answer 2: (score: 0)

Say you have

results = [{'id': 'NM001', 'loc_name': 'KBD', 'qty': '20'}, {'id': 'NM003', 'loc_name': 'SLD', 'qty': '25'}]

You can get what you want by:-

result_dicts = { res['id']: res for res in results }
for _, result_dict in result_dicts.items():
    result_dict.pop('id')
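For reference, a small sketch of what this produces, written so that the input list is not mutated. It assumes results is the DictReader output from the question, so the first loc_name/qty pair of each row is already gone and the inner dicts keep the keys loc_name and qty.

```python
# Rows as printed by the DictReader attempt in the question.
results = [
    {'id': 'NM001', 'loc_name': 'KBD', 'qty': '20'},
    {'id': 'NM003', 'loc_name': 'SLD', 'qty': '25'},
]

# Build a new dict per row that leaves out 'id', so results itself is untouched.
result_dicts = {
    res['id']: {key: value for key, value in res.items() if key != 'id'}
    for res in results
}

print(result_dicts)
# {'NM001': {'loc_name': 'KBD', 'qty': '20'}, 'NM003': {'loc_name': 'SLD', 'qty': '25'}}
```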