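Python - Reading a text file into a dict

Time: 2015-03-25 11:56:57

Tags: python regex dictionary

I have to read a text file into a dictionary in Python. I have tried several options, but I can't get it to work. The text file has the following format: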

Shop: someshop  
Schedule: from 8:00 to 18:00  
Day: 11:11:2011  
Items Sold: 456  
List of purchases:  
(product, 123, 12:30)    
(product, 123, 12:30)  
(product, 123, 12:30)

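I have also tried using regular expressions, but I can't figure out how to pick up the items in the list of purchases.

Here is some of the code I tried: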

d = {}
with open("sometext.txt", "r") as f:
    for line in f:

        (key, val) = line.split(': ')
        d[file] = (key,val)
        print (val)


print d

1 answer:

Answer 0: (score: 1)

You are almost there; you should use key as the key in the dictionary, not file:

(key, val) = line.split(': ')
d[key] = val.rstrip('\n')

I added a str.rstrip('\n') call; presumably you don't want to store the newline at the end of each line.
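As a small illustration of that line, here is a sketch of what the split and the stripping do to one sample line. The sample string is an assumption modelled on the file in the question, including its trailing spaces. Passing a maximum split count of 1 is an optional safeguard, not part of the code above: it keeps a value that itself contains ': ' in one piece.

```python
# Hypothetical sample line, modelled on the question's file (note the trailing spaces).
line = "Schedule: from 8:00 to 18:00  \n"

# Split only at the first ': ' so that a value containing ': ' stays intact.
key, val = line.split(': ', 1)

# rstrip('\n') removes only the newline; the trailing spaces are kept.
print(repr(val.rstrip('\n')))

# strip() with no argument removes all surrounding whitespace, newline included.
print(repr(val.strip()))
```

Whether to keep or drop the trailing spaces depends on what the values are used for afterwards; str.strip() is the simpler choice if they carry no meaning.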

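However, you need to parse the list of purchases separately, because those lines do not follow your key: value pattern. I am assuming here that it is the last entry in the file: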

d = {}
with open("sometext.txt", "r") as f:
    for line in f:
        if line.startswith('List of purchases'):
            purchases = d['List of purchases'] = []
            for line in f:
                info = line.strip('() \n').split(', ')
                purchases.append(info)
            break
        key, val = line.split(': ')
        d[key] = val.rstrip('\n')

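When the List of purchases line is reached, this reads the rest of the file into a separate list. The inner for loop iterates over the same file object, so it consumes all remaining lines, and the break then ends the outer loop.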
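Since the question mentions regular expressions, here is a sketch of how a single purchase line could be matched with the re module instead of str.strip() and str.split(). The pattern assumes each line looks like "(name, number, HH:MM)"; the names PURCHASE_RE and parse_purchase are hypothetical, and the question does not say what the middle number means, so it is only converted to an int.

```python
import re

# Assumed line shape: "(name, number, HH:MM)" with optional surrounding whitespace.
PURCHASE_RE = re.compile(
    r'^\s*\(\s*([^,]+?)\s*,\s*(\d+)\s*,\s*(\d{1,2}:\d{2})\s*\)\s*$'
)

def parse_purchase(line):
    """Return (product, number, time) for a purchase line, or None if it does not match."""
    match = PURCHASE_RE.match(line)
    if match is None:
        return None
    product, number, time = match.groups()
    # The number is converted to int; the time is left as a string.
    return product, int(number), time

print(parse_purchase("(product, 123, 12:30)    \n"))
print(parse_purchase("Shop: someshop  \n"))
```

Compared with splitting on ', ', the pattern rejects lines that do not have the expected shape, which may or may not be wanted depending on how regular the real file is.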

Demo:

>>> from io import StringIO
>>> sample = '''\
... Shop: someshop  
... Schedule: from 8:00 to 18:00  
... Day: 11:11:2011  
... Items Sold: 456  
... List of purchases:  
... (product, 123, 12:30)    
... (product, 123, 12:30)  
... (product, 123, 12:30)
... '''
>>> d = {}
>>> with StringIO(sample) as f:
...     for line in f:
...         if line.startswith('List of purchases'):
...             purchases = d['List of purchases'] = []
...             for line in f:
...                 info = line.strip('() \n').split(', ')
...                 purchases.append(info)
...             break
...         key, val = line.split(': ')
...         d[key] = val.rstrip('\n')
... 
>>> d
{'Schedule': 'from 8:00 to 18:00  ', 'List of purchases': [['product', '123', '12:30'], ['product', '123', '12:30'], ['product', '123', '12:30']], 'Day': '11:11:2011  ', 'Shop': 'someshop  ', 'Items Sold': '456  '}
>>> from pprint import pprint
>>> pprint(d)
{'Day': '11:11:2011  ',
 'Items Sold': '456  ',
 'List of purchases': [['product', '123', '12:30'],
                       ['product', '123', '12:30'],
                       ['product', '123', '12:30']],
 'Schedule': 'from 8:00 to 18:00  ',
 'Shop': 'someshop  '}
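
The loop above relies on the list of purchases being the last entry, and the values in the output still carry their trailing spaces. If neither is wanted, a variant along these lines might work. It is only a sketch: the function name parse_shop_lines is hypothetical, and it assumes that every purchase line starts with '(' and that no other kind of line does.

```python
from io import StringIO

def parse_shop_lines(lines):
    """Build a dict from 'key: value' lines plus a parenthesised purchase list."""
    result = {}
    purchases = None  # becomes a list once the 'List of purchases' header is seen
    for raw in lines:
        line = raw.strip()  # drop the newline and any surrounding spaces
        if not line:
            continue  # skip blank lines
        if line.startswith('('):
            # Purchase line; ignored if it appears before the header.
            if purchases is not None:
                purchases.append(line.strip('()').split(', '))
            continue
        key, sep, val = line.partition(':')  # split at the first colon only
        if not sep:
            continue  # not a 'key: value' line
        key, val = key.strip(), val.strip()
        if key == 'List of purchases':
            purchases = result[key] = []
        else:
            result[key] = val
    return result

# Hypothetical input where a key follows the purchase list.
sample = (
    "Shop: someshop  \n"
    "List of purchases:  \n"
    "(product, 123, 12:30)    \n"
    "(product, 123, 12:30)\n"
    "Items Sold: 456  \n"
)
d = parse_shop_lines(StringIO(sample))
print(d)
```

Because the function takes any iterable of lines, it can be called with an open file object in the same way as with the StringIO object used here.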