Convert a text file to a dictionary using Python

Asked: 2017-12-24 15:36:03

Tags: python list file dictionary text

I have a text file with the following pattern:

[Badges_373382]
Deleted=0
Button2=0 1497592154
Button1=0 1497592154
ProxReader=0
StartProc=100 1509194246 ""
NextStart=0
LastSeen=1509194246
Enabled=1
Driver=Access Control
Program=AccessProxBadge
LocChg=1509120279
Name=asd
Neuron=7F0027BF2D
Owner=373381
LostSince=1509120774
Index1=218
Photo=unknown.jpg
LastProxReader=0
Temp=0
LastTemp=0
LastMotionless=0
LastMotion=1497592154
BatteryLow=0
PrevReader=10703
Reader=357862
SuspendTill=0
SuspendSince=0
Status=1001
ConvertUponDownload=0
AXSFlags=0
Params=10106
Motion=1
USER_DATA_CreationDate=6/15/2017 4:48:15 PM
OwnerOldName=asd

[Badges_373384]
Deleted=0
Button2=0 1497538610
Button1=0 1497538610
ProxReader=0
StartProc=100 1509194246 ""
NextStart=0
LastSeen=1513872678
Enabled=1
Driver=Access Control
Program=AccessProxBadge
LocChg=1513872684
Name=dsa
Neuron=7F0027CC1C
Owner=373383
LostSince=1513872723
Index1=219
Photo=unknown.jpg
LastProxReader=0
Temp=0
LastTemp=0
LastMotionless=0
LastMotion=1497538610
BatteryLow=0
PrevReader=357874
Reader=357873
SuspendTill=0
SuspendSince=0
Status=1001
ConvertUponDownload=0
AXSFlags=0
Params=10106
Motion=1
USER_DATA_CreationDate=6/15/2017 4:48:51 PM
OwnerOldName=dsa

[Badges_373386]
Deleted=0
Button2=0 1497780768
Button1=0 1497780768
ProxReader=0
StartProc=100 1509194246 ""
NextStart=0
LastSeen=1514124910
Enabled=1
Driver=Access Control
Program=AccessProxBadge
LocChg=1514124915
Name=ss
Neuron=7F0027B5FD
Owner=373385
LostSince=1514124950
Index1=220
Photo=unknown.jpg
LastProxReader=0
Temp=0
LastTemp=0
LastMotionless=0
LastMotion=1497780768
BatteryLow=0
PrevReader=357872
Reader=357871
SuspendTill=0
SuspendSince=0
Status=1001
ConvertUponDownload=0
AXSFlags=0
Params=10106
Motion=1
USER_DATA_CreationDate=6/15/2017 4:49:24 PM
OwnerOldName=ss

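Each new "badge" block starts with [Badges_number] and ends with a blank line.

Using Python 3.6, I want to convert this file into a dictionary so that I can access the information easily.

It should look like this:

content = {"Badges_373382": {"Deleted": 0, ..}, "Badges_371231": {"Deleted": 0, ..}}

I am quite confused about how to do this and would appreciate some help.

Thanks!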

4 Answers:

Answer 0 (score: 2)

This is essentially an INI file, and Python provides the configparser module to parse such files.

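import configparser

config = configparser.ConfigParser()
config.optionxform = str  # keep key case; keys are lowercased by default
with open('badges.ini') as f:
    config.read_file(f)  # readfp() is deprecated (removed in Python 3.12)

r = {section: dict(config[section]) for section in config.sections()}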
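As a small follow-up sketch (the inline sample text and the variable names are assumptions for illustration, not part of the original answer): `ConfigParser` lowercases keys and applies `%` interpolation by default, so the example below turns both off and then reads one value as a string and one as an integer.

```python
import configparser

# Hypothetical three-key sample standing in for badges.ini
sample = """
[Badges_373382]
Deleted=0
Driver=Access Control
StartProc=100 1509194246 ""
"""

# interpolation=None keeps any literal '%' in values untouched
config = configparser.ConfigParser(interpolation=None)
config.optionxform = str  # preserve key case (the default lowercases keys)
config.read_string(sample)

content = {section: dict(config[section]) for section in config.sections()}

print(content["Badges_373382"]["Driver"])
print(config.getint("Badges_373382", "Deleted"))
```

All values in `content` are strings; typed getters such as `getint` convert a single value on request.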

Answer 1 (score: 0)

Since the headers have the form [Badges_373382], you can iterate over the lines and group them by whether they are a header:

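import re
import itertools

with open('filename.txt') as f:
    # drop blank lines; the list is built while the file is still open
    lines = [i.strip('\n') for i in f if i.strip()]

# group consecutive lines by whether they look like a [Name_123] header
new_data = [(a, list(b)) for a, b in itertools.groupby(lines, key=lambda x: bool(re.match(r'\[[a-zA-Z]+_+\d+\]', x)))]
# pair each header with the data lines that follow it; split on the first '=' only
final_data = {new_data[i][-1][-1]: dict(c.split('=', 1) for c in new_data[i+1][-1]) for i in range(0, len(new_data), 2)}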

Output:

{'[Badges_373384]': {'OwnerOldName': 'dsa', 'LastMotionless': '0', 'NextStart': '0', 'Driver': 'Access Control', 'LastTemp': '0', 'USER_DATA_CreationDate': '6/15/2017 4:48:51 PM', 'Program': 'AccessProxBadge', 'LocChg': '1513872684', 'Reader': '357873', 'LostSince': '1513872723', 'LastMotion': '1497538610', 'Status': '1001', 'Deleted': '0', 'SuspendTill': '0', 'ProxReader': '0', 'LastSeen': '1513872678', 'BatteryLow': '0', 'Index1': '219', 'Name': 'dsa', 'Temp': '0', 'Enabled': '1', 'StartProc': '100 1509194246 ""', 'Motion': '1', 'Button2': '0 1497538610', 'Button1': '0 1497538610', 'SuspendSince': '0', 'ConvertUponDownload': '0', 'PrevReader': '357874', 'AXSFlags': '0', 'LastProxReader': '0', 'Photo': 'unknown.jpg', 'Neuron': '7F0027CC1C', 'Owner': '373383', 'Params': '10106'}, '[Badges_373382]': {'OwnerOldName': 'asd', 'LastMotionless': '0', 'NextStart': '0', 'Driver': 'Access Control', 'LastTemp': '0', 'USER_DATA_CreationDate': '6/15/2017 4:48:15 PM', 'Program': 'AccessProxBadge', 'LocChg': '1509120279', 'Reader': '357862', 'LostSince': '1509120774', 'LastMotion': '1497592154', 'Status': '1001', 'Deleted': '0', 'SuspendTill': '0', 'ProxReader': '0', 'LastSeen': '1509194246', 'BatteryLow': '0', 'Index1': '218', 'Name': 'asd', 'Temp': '0', 'Enabled': '1', 'StartProc': '100 1509194246 ""', 'Motion': '1', 'Button2': '0 1497592154', 'Button1': '0 1497592154', 'SuspendSince': '0', 'ConvertUponDownload': '0', 'PrevReader': '10703', 'AXSFlags': '0', 'LastProxReader': '0', 'Photo': 'unknown.jpg', 'Neuron': '7F0027BF2D', 'Owner': '373381', 'Params': '10106'}, '[Badges_373386]': {'OwnerOldName': 'ss', 'LastMotionless': '0', 'NextStart': '0', 'Driver': 'Access Control', 'LastTemp': '0', 'USER_DATA_CreationDate': '6/15/2017 4:49:24 PM', 'Program': 'AccessProxBadge', 'LocChg': '1514124915', 'Reader': '357871', 'LostSince': '1514124950', 'LastMotion': '1497780768', 'Status': '1001', 'Deleted': '0', 'SuspendTill': '0', 'ProxReader': '0', 'LastSeen': '1514124910', 'BatteryLow': 
'0', 'Index1': '220', 'Name': 'ss', 'Temp': '0', 'Enabled': '1', 'StartProc': '100 1509194246 ""', 'Motion': '1', 'Button2': '0 1497780768', 'Button1': '0 1497780768', 'SuspendSince': '0', 'ConvertUponDownload': '0', 'PrevReader': '357872', 'AXSFlags': '0', 'LastProxReader': '0', 'Photo': 'unknown.jpg', 'Neuron': '7F0027B5FD', 'Owner': '373385', 'Params': '10106'}}
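The keys in this output keep their square brackets, while the question asks for keys such as "Badges_373382". A minimal sketch of one way to remove them, using a shortened, hypothetical stand-in for `final_data`:

```python
# Hypothetical, shortened stand-in for the final_data dictionary built above
final_data = {
    '[Badges_373382]': {'Deleted': '0', 'Name': 'asd'},
    '[Badges_373384]': {'Deleted': '0', 'Name': 'dsa'},
}

# Remove the surrounding square brackets from each section name
content = {header.strip('[]'): values for header, values in final_data.items()}

print(content['Badges_373382']['Name'])
```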

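Answer 2 (score: 0)

You can go through each line of the file and add what you need. There are three cases you can encounter:

1. The line is a header, which will be a key in the final dictionary. Here you can check whether the line starts with "[Badges" and keep the current header in a temporary variable while reading the file.

2. The line is blank, marking the end of the data for the badge currently being read. All you need to do is add the information collected for that badge to the dictionary under the correct key. Depending on your implementation, you can remove blank lines beforehand or keep them while reading the lines.

3. Otherwise, the line holds information that needs to be stored. You first need to split it on "=" and then store the result in the dictionary.

With these suggestions, you can write something like this to accomplish the task: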

from collections import defaultdict

# dictionary of dictionary values
data = defaultdict(dict)

with open('pattern.txt') as file:
    lines = [line.strip('\n') for line in file]

    # keeps track of current header
    header = None

    # case 2: drop empty lines beforehand
    valid_lines = [line for line in lines if line]

    for line in valid_lines:

        # case 1, for headers
        if line.startswith('[Badges'):

            # updates current header, and deletes square brackets
            header = line.replace('[', '').replace(']', '')

        # case 3, data has been found
        else:

            # split on the first '=' only, so values containing '=' stay intact
            key, value = line.split('=', 1)
            data[header][key] = value

print(dict(data))

Which outputs:

{'Badges_373382': {'Deleted': '0', 'Button2': '0 1497592154', 'Button1': '0 1497592154', 'ProxReader': '0', 'StartProc': '100 1509194246 ""', 'NextStart': '0', 'LastSeen': '1509194246', 'Enabled': '1', 'Driver': 'Access Control', 'Program': 'AccessProxBadge', 'LocChg': '1509120279', 'Name': 'asd', 'Neuron': '7F0027BF2D', 'Owner': '373381', 'LostSince': '1509120774', 'Index1': '218', 'Photo': 'unknown.jpg', 'LastProxReader': '0', 'Temp': '0', 'LastTemp': '0', 'LastMotionless': '0', 'LastMotion': '1497592154', 'BatteryLow': '0', 'PrevReader': '10703', 'Reader': '357862', 'SuspendTill': '0', 'SuspendSince': '0', 'Status': '1001', 'ConvertUponDownload': '0', 'AXSFlags': '0', 'Params': '10106', 'Motion': '1', 'USER_DATA_CreationDate': '6/15/2017 4:48:15 PM', 'OwnerOldName': 'asd'}, 'Badges_373384': {'Deleted': '0', 'Button2': '0 1497538610', 'Button1': '0 1497538610', 'ProxReader': '0', 'StartProc': '100 1509194246 ""', 'NextStart': '0', 'LastSeen': '1513872678', 'Enabled': '1', 'Driver': 'Access Control', 'Program': 'AccessProxBadge', 'LocChg': '1513872684', 'Name': 'dsa', 'Neuron': '7F0027CC1C', 'Owner': '373383', 'LostSince': '1513872723', 'Index1': '219', 'Photo': 'unknown.jpg', 'LastProxReader': '0', 'Temp': '0', 'LastTemp': '0', 'LastMotionless': '0', 'LastMotion': '1497538610', 'BatteryLow': '0', 'PrevReader': '357874', 'Reader': '357873', 'SuspendTill': '0', 'SuspendSince': '0', 'Status': '1001', 'ConvertUponDownload': '0', 'AXSFlags': '0', 'Params': '10106', 'Motion': '1', 'USER_DATA_CreationDate': '6/15/2017 4:48:51 PM', 'OwnerOldName': 'dsa'}, 'Badges_373386': {'Deleted': '0', 'Button2': '0 1497780768', 'Button1': '0 1497780768', 'ProxReader': '0', 'StartProc': '100 1509194246 ""', 'NextStart': '0', 'LastSeen': '1514124910', 'Enabled': '1', 'Driver': 'Access Control', 'Program': 'AccessProxBadge', 'LocChg': '1514124915', 'Name': 'ss', 'Neuron': '7F0027B5FD', 'Owner': '373385', 'LostSince': '1514124950', 'Index1': '220', 'Photo': 'unknown.jpg', 
'LastProxReader': '0', 'Temp': '0', 'LastTemp': '0', 'LastMotionless': '0', 'LastMotion': '1497780768', 'BatteryLow': '0', 'PrevReader': '357872', 'Reader': '357871', 'SuspendTill': '0', 'SuspendSince': '0', 'Status': '1001', 'ConvertUponDownload': '0', 'AXSFlags': '0', 'Params': '10106', 'Motion': '1', 'USER_DATA_CreationDate': '6/15/2017 4:49:24 PM', 'OwnerOldName': 'ss'}}

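Note: the code above is just one possibility. You can adapt it to your needs or improve on it.

I also used collections.defaultdict to add the data, since it is easier to use. Wrapping the result in dict() at the end converts it to a regular dictionary, which is optional.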
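Case 2 above mentions that blank lines can also be kept and used as end-of-badge markers instead of being removed first. A minimal sketch of that variant follows; the inline sample, the badge names `Badges_1` and `Badges_2`, and the variable names are hypothetical.

```python
import io

# Hypothetical sample; in practice this would be open('pattern.txt')
sample = io.StringIO(
    "[Badges_1]\nDeleted=0\nDriver=Access Control\n"
    "\n"
    "[Badges_2]\nDeleted=1\nName=a=b\n"
)

data = {}
header, current = None, {}

for raw in sample:
    line = raw.strip()
    if not line:
        # blank line: the current badge is complete, so store it
        if header is not None:
            data[header] = current
        header, current = None, {}
    elif line.startswith('[') and line.endswith(']'):
        # header line: remember the name without the brackets
        header = line[1:-1]
    else:
        # split on the first '=' only so values may contain '='
        key, value = line.split('=', 1)
        current[key] = value

# the last badge may not be followed by a blank line
if header is not None:
    data[header] = current

print(data)
```

This version reads the file line by line without building an intermediate list, at the cost of one extra check after the loop for a final badge that has no trailing blank line.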

Answer 3 (score: 0)

You can try using a regular expression and splitting each match:

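import re

# raw string so that \[ and \w reach the regex engine unchanged
pattern = r'^\[Badges.+?OwnerOldName=\w+'
with open('file.txt', 'r') as f:
    match = re.finditer(pattern, f.read(), re.DOTALL | re.MULTILINE)
    new = []
    for kk in match:
        if kk.group() != '\n':
            new.append(kk.group())
    # the first token is the header; the rest are whitespace-separated tokens
    print({i.split()[0]: i.split()[1:] for i in new})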

Output:

  {'[Badges_373384]': ['Deleted=0', 'Button2=0', '1497538610', 'Button1=0', '1497538610', 'ProxReader=0', 'StartProc=100', '1509194246', '""', 'NextStart=0', 'LastSeen=1513872678', 'Enabled=1', 'Driver=Access', 'Control', 'Program=AccessProxBadge', 'LocChg=1513872684', 'Name=dsa', 'Neuron=7F0027CC1C', 'Owner=373383', 'LostSince=1513872723', 'Index1=219', 'Photo=unknown.jpg', 'LastProxReader=0', 'Temp=0', 'LastTemp=0', 'LastMotionless=0', 'LastMotion=1497538610', 'BatteryLow=0', 'PrevReader=357874', 'Reader=357873', 'SuspendTill=0', 'SuspendSince=0', 'Status=1001', 'ConvertUponDownload=0', 'AXSFlags=0', 'Params=10106', 'Motion=1', 'USER_DATA_CreationDate=6/15/2017', '4:48:51', 'PM', 'OwnerOldName=dsa'], '[Badges_373382]': ['Deleted=0', 'Button2=0', '1497592154', 'Button1=0', '1497592154', 'ProxReader=0', 'StartProc=100', '1509194246', '""', 'NextStart=0', 'LastSeen=1509194246', 'Enabled=1', 'Driver=Access', 'Control', 'Program=AccessProxBadge', 'LocChg=1509120279', 'Name=asd', 'Neuron=7F0027BF2D', 'Owner=373381', 'LostSince=1509120774', 'Index1=218', 'Photo=unknown.jpg', 'LastProxReader=0', 'Temp=0', 'LastTemp=0', 'LastMotionless=0', 'LastMotion=1497592154', 'BatteryLow=0', 'PrevReader=10703', 'Reader=357862', 'SuspendTill=0', 'SuspendSince=0', 'Status=1001', 'ConvertUponDownload=0', 'AXSFlags=0', 'Params=10106', 'Motion=1', 'USER_DATA_CreationDate=6/15/2017', '4:48:15', 'PM', 'OwnerOldName=asd'], '[Badges_373386]': ['Deleted=0', 'Button2=0', '1497780768', 'Button1=0', '1497780768', 'ProxReader=0', 'StartProc=100', '1509194246', '""', 'NextStart=0', 'LastSeen=1514124910', 'Enabled=1', 'Driver=Access', 'Control', 'Program=AccessProxBadge', 'LocChg=1514124915', 'Name=ss', 'Neuron=7F0027B5FD', 'Owner=373385', 'LostSince=1514124950', 'Index1=220', 'Photo=unknown.jpg', 'LastProxReader=0', 'Temp=0', 'LastTemp=0', 'LastMotionless=0', 'LastMotion=1497780768', 'BatteryLow=0', 'PrevReader=357872', 'Reader=357871', 'SuspendTill=0', 'SuspendSince=0', 'Status=1001', 
'ConvertUponDownload=0', 'AXSFlags=0', 'Params=10106', 'Motion=1', 'USER_DATA_CreationDate=6/15/2017', '4:49:24', 'PM', 'OwnerOldName=ss']}
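Because this output splits on whitespace, values that contain spaces (for example "Driver=Access Control") end up as separate list items. A hedged sketch of a regex variant that splits each block on line breaks and builds nested dictionaries instead; the pattern, the group names and the sample text are assumptions for illustration, and blocks are assumed to be separated by blank lines.

```python
import re

# Hypothetical sample text; in practice this would come from f.read()
text = (
    "[Badges_373382]\nDeleted=0\nDriver=Access Control\nOwnerOldName=asd\n"
    "\n"
    "[Badges_373384]\nDeleted=0\nDriver=Access Control\nOwnerOldName=dsa\n"
)

# One match per block: a header line followed by consecutive non-empty lines
block = re.compile(r'^\[(?P<name>[^\]]+)\]\n(?P<body>(?:.+\n?)*)', re.MULTILINE)

content = {}
for m in block.finditer(text):
    # split each line on the first '=' only, keeping spaces inside values
    pairs = (line.split('=', 1) for line in m.group('body').splitlines())
    content[m.group('name')] = dict(pairs)

print(content['Badges_373384']['Driver'])
```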