How to write data from a CSV file into a dictionary in Python without importing csv

Time: 2019-12-16 12:27:04

Tags: python dictionary

I have this data saved in a file, which looks like this:

    Commodity, USA, Canada, Europe, China, India, Australia
    Wheat,61.7,27.2,133.9,121,94.9,22.9
    Rice Milled,6.3, -,2.1,143,105.2,0.8
    Oilseeds,93.1,19,28.1,59.8,36.8,5.7
    Cotton,17.3, -,1.5,35,28.5,4.6

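The top row is a header, and the first column is also a header. A dash means there is no data.

The returned dictionary should have the following format:

  • The keys of the dictionary are the country/region names.

  • The values are dictionaries containing the data for each country/region. The keys of these inner dictionaries are commodity names, and the values are the amounts that country produces of the given commodity. If there is no data for a commodity (a dash in the csv file), that commodity must not be included in the dictionary. For example, Cotton must not appear in Canada's dictionary. Note that "-" (a dash) is not the same as the value 0.

For the file above, the result should look like: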

{'Canada': {'Wheat': 27.2, 'Oilseeds': 19}, 'USA': {'Wheat': 61.7, 'Cotton': 17.3, ...}, ...}
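To make the difference between a dash and a zero concrete, here is a small hand-written illustration of the target structure for the Wheat and Cotton rows only. The variable name `expected` is just an assumption for illustration.

```python
# Hand-built target structure for two commodities and two countries.
expected = {
    'USA': {'Wheat': 61.7, 'Cotton': 17.3},
    'Canada': {'Wheat': 27.2},  # no 'Cotton' key: the file has a dash there
}

# A missing commodity is absent from the inner dict, not stored as 0.
print('Cotton' in expected['Canada'])    # False
print(expected['Canada'].get('Cotton'))  # None
print(expected['USA']['Cotton'])         # 17.3
```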

I'm confused about where to start or what to do. I've been stuck on this for a few days.

3 Answers:

Answer 0: (score: 0)

If you don't mind importing the pandas module, you can do it as follows

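import pandas as pd

# Read the file; the first row becomes the column names
df = pd.read_csv('test2.csv', sep=',')
# Index by commodity, then serialise column by column as a JSON string
print(df.set_index('Commodity').to_json())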

This gives you the following output (a JSON string rather than a dict):

{" USA":{"Wheat":61.7,"Rice Milled":6.3,"Oilseeds":93.1,"Cotton":17.3}," Canada":{"Wheat":"27.2","Rice Milled":" -","Oilseeds":"19","Cotton":" -"}," Europe":{"Wheat":133.9,"Rice Milled":2.1,"Oilseeds":28.1,"Cotton":1.5}," China":{"Wheat":121.0,"Rice Milled":143.0,"Oilseeds":59.8,"Cotton":35.0}," India":{"Wheat":94.9,"Rice Milled":105.2,"Oilseeds":36.8,"Cotton":28.5}," Australia":{"Wheat":22.9,"Rice Milled":0.8,"Oilseeds":5.7,"Cotton":4.6}}
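The string above still has leading spaces in the country names, keeps the dash cells, and stores some numbers as strings, so it does not yet match the format the question asks for. One possible clean-up, as a rough sketch: the function name `clean` and the shortened sample string `raw` are assumptions for illustration, and it uses the standard `json` module to parse the string.

```python
import json

# A shortened copy of the JSON string shown above (two columns, two rows).
raw = '{" USA":{"Wheat":61.7,"Rice Milled":6.3}," Canada":{"Wheat":"27.2","Rice Milled":" -"}}'

def clean(json_text):
    """Strip key whitespace, drop dash cells, and convert values to float."""
    parsed = json.loads(json_text)
    result = {}
    for country, commodities in parsed.items():
        inner = {}
        for commodity, value in commodities.items():
            # Values may be numbers or strings, so normalise via str() first.
            text = str(value).strip()
            if text == "-":
                continue  # a dash means "no data": leave the key out
            inner[commodity] = float(text)
        result[country.strip()] = inner
    return result

cleaned = clean(raw)
print(cleaned)
```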

Answer 1: (score: 0)

If you really want it without any imports (for whatever reason), the shortest way I can think of is:

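with open('data_sample.txt') as f:
    # Split every non-empty line on commas and strip whitespace around each cell
    split_lines = [[cell.strip() for cell in l.split(',')] for l in f if l.strip()]

# zip(*rows) transposes the table, so each tuple is one column
columns = list(zip(*split_lines))
commodities = columns[0][1:]  # the first column holds the commodity names

d = {}
for column in columns[1:]:
    # column[0] is the country name; skip cells that contain only a dash
    d[column[0]] = {name: value for name, value in zip(commodities, column[1:]) if value != '-'}

print(d)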

Output:

{'USA': {'Wheat': '61.7', 'Rice Milled': '6.3', 'Oilseeds': '93.1', 'Cotton': '17.3'}, 'Canada': {'Wheat': '27.2', 'Oilseeds': '19'}, 'Europe': {'Wheat': '133.9', 'Rice Milled': '2.1', 'Oilseeds': '28.1', 'Cotton': '1.5'}, 'China': {'Wheat': '121', 'Rice Milled': '143', 'Oilseeds': '59.8', 'Cotton': '35'}, 'India': {'Wheat': '94.9', 'Rice Milled': '105.2', 'Oilseeds': '36.8', 'Cotton': '28.5'}, 'Australia': {'Wheat': '22.9', 'Rice Milled': '0.8', 'Oilseeds': '5.7', 'Cotton': '4.6'}}

This could probably be done more neatly with zip and so on, but it should give you the general idea.
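Note that the values in the output above are still strings, while the question asks for numbers. One way to convert them, as a sketch: the helper name `to_number` and the shortened sample dict are assumptions for illustration.

```python
def to_number(text):
    """Return an int when the text is a whole number, otherwise a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)

# Shortened sample in the same shape as the output above.
d = {
    'USA': {'Wheat': '61.7', 'Cotton': '17.3'},
    'Canada': {'Wheat': '27.2', 'Oilseeds': '19'},
}

# Rebuild the nested dict with every value converted to a number.
numeric = {
    country: {name: to_number(value) for name, value in commodities.items()}
    for country, commodities in d.items()
}
print(numeric)
```

If you would rather have every value as a float, replacing `to_number` with the built-in `float` should be enough.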

Answer 2: (score: 0)

If you don't want to import any modules, you can also do it this way

data = {}

with open('data.txt') as f:
    column_dict = {}
    for i, line in enumerate(f):
        # Split the row on commas and strip whitespace around each cell
        vals = [v.strip() for v in line.split(',')]

        row_heading = vals[0]
        row_data = vals[1:]

        if i == 0:
            # Header row: one empty dict per country, plus a lookup from
            # country name to its column index
            data = {col: {} for col in row_data}
            column_dict = {col: idx for idx, col in enumerate(row_data)}
        elif row_heading:  # ignore blank lines
            for x in data:
                # Exclude cells that contain only a dash (no data)
                if row_data[column_dict[x]] != "-":
                    data[x][row_heading] = row_data[column_dict[x]]

print(data)
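The same row-by-row idea can be wrapped in a function that accepts any iterable of lines, which makes it easy to try without a file on disk. This is only a sketch: the function name `table_to_dict` and the inline `sample` list are assumptions, and it converts every value with `float()`.

```python
def table_to_dict(lines):
    """Build {country: {commodity: amount}} from comma-separated lines."""
    # Split each non-empty line on commas and strip whitespace around cells.
    rows = [[cell.strip() for cell in line.split(',')] for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    countries = header[1:]  # skip the "Commodity" column heading
    result = {country: {} for country in countries}
    for row in body:
        commodity = row[0]
        for country, cell in zip(countries, row[1:]):
            if cell != '-':  # a dash means "no data": leave the key out
                result[country][commodity] = float(cell)
    return result

# Three lines taken from the file shown in the question.
sample = [
    "Commodity, USA, Canada, Europe, China, India, Australia\n",
    "Wheat,61.7,27.2,133.9,121,94.9,22.9\n",
    "Cotton,17.3, -,1.5,35,28.5,4.6\n",
]
result = table_to_dict(sample)
print(result['Canada'])  # {'Wheat': 27.2}
print(result['USA'])     # {'Wheat': 61.7, 'Cotton': 17.3}
```

A file object is also an iterable of lines, so passing the object returned by `open(...)` inside a `with` block should work the same way.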