Read a text file and create a Python dictionary

Time: 2018-08-18 16:25:36

Tags: python python-3.x

I have a text file like the one shown below:

KEY,NAME,RANK,BOOKNAME,SCORE,AUTHER
123,ABCD,500,FREEDOM1,15200,PXYZ
133,EFGH,400,FREEDOM2,15300.5,XTYZ
nan,SYGH,700,FREEDOM3,15400,RYYZ
143,LKMN,800,FREEDOM4,15500.5,XYCZ

I want to read this text file and create a nested dictionary that will be used later in the program.

dict = {
123:{'NAME':'ABCD','RANK':500,'BOOKNAME':'FREEDOM1', 'SCORE':15200, 'AUTHER':'PXYZ'},
133:{'NAME':'EFGH','RANK':400,'BOOKNAME':'FREEDOM2', 'SCORE':15300.5, 'AUTHER':'XTYZ'},
143:{'NAME':'LKMN','RANK':800,'BOOKNAME':'FREEDOM4', 'SCORE':15500.5, 'AUTHER':'XYCZ'}
}

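Note: the code should drop any row whose key value is "nan".

3 answers:

Answer 0: (score: 0)

You can use the csv module like this. If you need to check whether the KEY value is numeric, write a small function for that: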

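import csv
import math

def is_float(s):
    """Return True if s parses as a float and is not NaN."""
    try:
        # float('nan') parses without error, so NaN must be rejected explicitly
        return not math.isnan(float(s))
    except ValueError:
        return False


with open('input.csv') as f:
    reader = csv.DictReader(f)
    # keep only the rows whose KEY is a real number
    rows = [dict(a) for a in reader if is_float(a['KEY'])]

print(rows)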
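
The `rows` list above is flat, while the question asks for a dictionary keyed by KEY. A minimal sketch of that last step follows. It assumes the sample data from the question (fed in through `io.StringIO` so it runs without a file), that every valid KEY is an integer string, and two hypothetical helper names, `is_valid_key` and `rows_to_nested`. The values are left as strings here.

```python
import csv
import io
import math

# Sample data from the question, held in memory so the sketch is self-contained.
SAMPLE = """KEY,NAME,RANK,BOOKNAME,SCORE,AUTHER
123,ABCD,500,FREEDOM1,15200,PXYZ
133,EFGH,400,FREEDOM2,15300.5,XTYZ
nan,SYGH,700,FREEDOM3,15400,RYYZ
143,LKMN,800,FREEDOM4,15500.5,XYCZ
"""


def is_valid_key(s):
    """True if s parses as a number other than NaN (hypothetical helper)."""
    try:
        return not math.isnan(float(s))
    except ValueError:
        return False


def rows_to_nested(rows):
    """Re-key a list of row dicts by their integer KEY (hypothetical helper)."""
    nested = {}
    for row in rows:
        # copy every column except KEY into the inner dictionary
        inner = {k: v for k, v in row.items() if k != 'KEY'}
        nested[int(row['KEY'])] = inner
    return nested


reader = csv.DictReader(io.StringIO(SAMPLE))
rows = [dict(r) for r in reader if is_valid_key(r['KEY'])]
nested = rows_to_nested(rows)
print(nested[123])
```

Building the inner dictionary with an explicit `k != 'KEY'` filter leaves the original rows untouched, which can be convenient if they are needed again later.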

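Answer 1: (score: 0)

What you need to do to reach your goal

First, you need to open the file (assuming it is a .txt file containing comma-separated values).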

filename = "csv_data.txt"
line_list = []
with open(filename, "r") as file:  # open in read mode; the file is closed automatically
    for line in file:
        # split each line on commas and collect the resulting lists
        line_list.append(line.strip().split(','))

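Next, you need to split each string (line) using ',' as the delimiter. To do that, call line.split(','), which gives you a list.

line_list[0] 

Here you will find the list of all the strings from line 1 of the text file.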
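
As a quick illustration of the split step, here is a minimal sketch using the first data row from the question. The variable names `line` and `fields` are only for illustration.

```python
# One raw line as it would be read from the file, including the trailing newline.
line = "123,ABCD,500,FREEDOM1,15200,PXYZ\n"

# strip() removes the newline, split(',') cuts the string at every comma.
fields = line.strip().split(',')
print(fields)
# ['123', 'ABCD', '500', 'FREEDOM1', '15200', 'PXYZ']
```

Every element is still a string at this point, so numeric columns such as RANK and SCORE have to be converted separately.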

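OK, I decided to add the code, but please don't just copy and paste it. Try to understand it by searching on Google, or use the Python documentation to see what each built-in function does.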

filename = "csv_data.txt"
line_list = []
output_dict = {}  # plain dicts suffice here: every key is assigned explicitly

with open(filename, "r") as file:  # open in read mode; closed automatically
    for line in file:
        if line.strip():  # skip blank lines
            line_list.append(line.strip().split(','))

key_names = line_list[0]  # remember, the first line of the file holds the key names

# read about slicing
for line in line_list[1:]:
    this_key = line[0]
    if this_key == 'nan':
        continue  # we don't want to add this row to our dict

    this_key_dict = {}
    # read about enumerate
    for i, word in enumerate(line[1:], start=1):
        if key_names[i] in ('SCORE', 'RANK'):
            # numeric columns: try int first, fall back to float
            try:
                word = int(word)
            except ValueError:
                word = float(word)
        this_key_dict[key_names[i]] = word

    output_dict[int(this_key)] = this_key_dict


def nice_print(dict_d):
    # print one "key value" pair per line
    for key, value in dict_d.items():
        print(key, value)


nice_print(output_dict)


>>> word = '7.8'
>>> float(word) if '.' in  word else int(word)
7.8
>>> word = '7'
>>> float(word) if '.' in  word else int(word)
7
>>>
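
The '.' check in the session above covers the sample data, but both branches raise ValueError on non-numeric text. A slightly more defensive sketch, using a hypothetical helper name `to_number`, tries int first, then float, and otherwise returns the string unchanged.

```python
def to_number(word):
    """Convert word to int or float when possible, else return it unchanged."""
    for cast in (int, float):
        try:
            return cast(word)
        except ValueError:
            continue  # this type did not fit, try the next one
    return word


print(to_number('7'))
print(to_number('7.8'))
print(to_number('ABCD'))
```

Note that float accepts the string 'nan', so the KEY column still needs its own filter before this helper is applied.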

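Answer 2: (score: 0)

You can use csv.DictReader to read the data file as a sequence of OrderedDicts (plain dicts from Python 3.8 onward), one per row. You can then rearrange and convert the data so that the nested dictionary meets your requirements. Here is an example using a dictionary comprehension.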

import csv

with open('text.csv') as f:
    reader = csv.DictReader(f)
    result = {
        int(d['KEY']):{k: int(v) if v.isdigit() else v for k, v in d.items() if k != 'KEY'}
        for d in reader if d['KEY'].isdigit()}
    print(result)

Edit: if all you need are string values, as in the solution Tanmay posted, you can do it with less code.

import csv
from pprint import pprint

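with open('text.csv') as f:
    # skip the 'nan' row; drop KEY explicitly instead of relying on pop() order
    results = {
        d['KEY']: {k: v for k, v in d.items() if k != 'KEY'}
        for d in csv.DictReader(f) if d['KEY'] != 'nan'
    }
pprint(results)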

Edit 2: casting the values

import csv
from pprint import pprint
import re


def cast_dict(d: dict):
    def cast_value(value: str):
        if value.isdigit():
            return int(value)
        elif re.match(r'^\d+\.\d+$', value):
            return float(value)
        return value
    return {k: cast_value(v) for k, v in d.items()}


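with open('text.csv') as f:
    results = {}
    for d in csv.DictReader(f):
        if d['KEY'].isdigit():  # 'nan' is not all digits, so that row is skipped
            key = int(d.pop('KEY'))  # remove KEY before casting the rest
            results[key] = cast_dict(d)

pprint(results)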
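
To check the casting approach against the expected output in the question without needing text.csv on disk, the sample data can be fed through `io.StringIO`. This is a minimal sketch under that assumption. `SAMPLE` and `expected` are copied from the question, and `cast_value` follows the same rules as the code above.

```python
import csv
import io
import re

# Sample data from the question, held in memory instead of text.csv.
SAMPLE = """KEY,NAME,RANK,BOOKNAME,SCORE,AUTHER
123,ABCD,500,FREEDOM1,15200,PXYZ
133,EFGH,400,FREEDOM2,15300.5,XTYZ
nan,SYGH,700,FREEDOM3,15400,RYYZ
143,LKMN,800,FREEDOM4,15500.5,XYCZ
"""


def cast_value(value):
    """Cast digit strings to int, simple decimals to float, else keep the string."""
    if value.isdigit():
        return int(value)
    if re.match(r'^\d+\.\d+$', value):
        return float(value)
    return value


results = {}
for d in csv.DictReader(io.StringIO(SAMPLE)):
    if d['KEY'].isdigit():  # the 'nan' row fails this test and is skipped
        key = int(d.pop('KEY'))
        results[key] = {k: cast_value(v) for k, v in d.items()}

# The nested dictionary the question asks for.
expected = {
    123: {'NAME': 'ABCD', 'RANK': 500, 'BOOKNAME': 'FREEDOM1', 'SCORE': 15200, 'AUTHER': 'PXYZ'},
    133: {'NAME': 'EFGH', 'RANK': 400, 'BOOKNAME': 'FREEDOM2', 'SCORE': 15300.5, 'AUTHER': 'XTYZ'},
    143: {'NAME': 'LKMN', 'RANK': 800, 'BOOKNAME': 'FREEDOM4', 'SCORE': 15500.5, 'AUTHER': 'XYCZ'},
}
print(results == expected)
```

The regular expression only matches plain decimals such as 15300.5, so negative numbers or exponent notation would stay as strings under these rules.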