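Parsing a file with Python and extracting lines between blocks

Time: 2015-07-28 18:02:09

Tags: python

I have a file that I need to parse in order to extract some specific lines. Here is a sample of the file's data: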

dn: uid=portaladmin,ou=people,ou=myrealm,dc=portalDomain
objectclass: wlsUser objectclass: top 
objectclass: person 
objectclass: organizationalPerson 
objectclass: inetOrgPerson 
cn: portaladmin 
sn: portaladmin 
description: Admin for portal domain 
uid: portaladmin userpassword:: e3NzaGF9L3JYUldtVERBUklCdWM3NGtBSlJQVFVjQ04yRmNkU3o= 
wlsMemberOf: cn=PortalSystemAdministrators,ou=groups,ou=myrealm,dc=portalDom  ain

dn: uid=weblogic,ou=people,ou=myrealm,dc=portalDomain 
objectclass: wlsUser 
objectclass: top 
objectclass: person 
objectclass: organizationalPerson 
objectclass: inetOrgPerson 
cn: weblogic 
sn: weblogic 
description: This user is the default administrator. 
uid: weblogic 
userpassword:: e3NzaGF9VHhObDZhTlBpZTFSa2VVeTRTak1vWm0yTFJmdlN4RE8= 
wlsMemberOf: cn=Administrators,ou=groups,ou=myrealm,dc=portalDomain 
wlsMemberOf: cn=PortalSystemAdministrators,ou=groups,ou=myrealm,dc=portalDomain

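As you can see, the information comes in blocks. I need to extract the values of the lines starting with cn:, sn:, description:, uid: and userpassword:, and I also need the script to search for specific uid and cn values taken from a list.

I'm not an experienced programmer, which is why I came here to ask the experts. Thanks in advance for any help.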

2 Answers:

Answer 0: (score: 1)

Just use str.startswith to find those lines, passing it a tuple of prefixes:

with open("in.txt") as f:
    for line in f:
        if line.startswith(("cn:","sn:", "description:", "uid:","userpassword:")):
            print(line.rstrip())

Output:

cn: portaladmin
sn: portaladmin
description: Admin for portal domain
uid: portaladmin userpassword:: e3NzaGF9L3JYUldtVERBUklCdWM3NGtBSlJQVFVjQ04yRmNkU3o=
cn: weblogic
sn: weblogic
description: This user is the default administrator.
uid: weblogic
userpassword:: e3NzaGF9VHhObDZhTlBpZTFSa2VVeTRTak1vWm0yTFJmdlN4RE8=
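The loop above prints the matching lines as one flat stream, so the block each line came from is lost. If the values need to stay grouped per entry, a minimal sketch follows, assuming entries are separated by blank lines as in the sample. The names `WANTED` and `parse_blocks` are made up for illustration, and the data is a shortened stand-in for the real file:

```python
import io

# Prefixes of the lines to keep (same tuple as in the loop above)
WANTED = ("cn:", "sn:", "description:", "uid:", "userpassword:")

def parse_blocks(lines):
    """Yield one list of matching lines per blank-line-separated block."""
    block = []
    for line in lines:
        line = line.rstrip()
        if not line:
            # A blank line ends the current block
            if block:
                yield block
            block = []
        elif line.startswith(WANTED):
            block.append(line)
    # The last block may not be followed by a blank line
    if block:
        yield block

# Shortened stand-in for the file; open("in.txt") could be passed instead
sample = io.StringIO(
    "dn: uid=weblogic,ou=people\n"
    "cn: weblogic\n"
    "uid: weblogic\n"
    "\n"
    "dn: uid=portaladmin,ou=people\n"
    "cn: portaladmin\n"
)
blocks = list(parse_blocks(sample))
for block in blocks:
    print(block)
```

Because `parse_blocks` accepts any iterable of lines, the same function should work on an open file object without reading the whole file into memory.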

Based on your comment, if you want to search for substrings, you can use any:

if any(sub in line for sub in ("cn: somestring", "sn: somestring", "description: somestring", "uid: somestring", "userpassword: somestring")):
    print(line.rstrip())

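If the pattern is more complex you may need a regular expression, but without knowing exactly what you want to extract it isn't possible to suggest a workable regex.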
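As one possible direction rather than a definitive answer, here is a regex sketch that splits each wanted line into a key and a value and keeps only the blocks whose uid appears in a given collection. It assumes one attribute per line and blank-line-separated blocks. `LINE_RE` and `extract_for_uids` are hypothetical names, and the sample text and password value are invented placeholders:

```python
import re

# Matches "key: value" as well as the double-colon form "key:: value"
LINE_RE = re.compile(r"^(cn|sn|description|uid|userpassword)::?\s*(.*)$")

def extract_for_uids(text, wanted_uids):
    """Return one dict per block whose uid value is in wanted_uids."""
    results = []
    # Blocks are assumed to be separated by one or more blank lines
    for block in re.split(r"\n\s*\n", text.strip()):
        entry = {}
        for line in block.splitlines():
            match = LINE_RE.match(line.strip())
            if match:
                entry[match.group(1)] = match.group(2)
        if entry.get("uid") in wanted_uids:
            results.append(entry)
    return results

# Invented sample data with placeholder password values
text = (
    "dn: uid=weblogic,ou=people\n"
    "cn: weblogic\n"
    "uid: weblogic\n"
    "userpassword:: QUJD\n"
    "\n"
    "dn: uid=portaladmin,ou=people\n"
    "cn: portaladmin\n"
    "uid: portaladmin\n"
)
result = extract_for_uids(text, {"weblogic"})
print(result)
```

A line that carries two attributes at once, such as the "uid: portaladmin userpassword:: ..." line in the sample, would end up with everything after "uid:" as the uid value, so that case would need to be split before matching.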

Answer 1: (score: -1)

extractedLines = []
keys = ["cn:", "sn:", "description:", "uid:", "userpassword:"]
with open("file.txt", "r") as f:
    for line in f:
        # any() appends each line at most once, even if several keys occur in it
        if any(item in line for item in keys):
            extractedLines.append(line)