I have a file that I need to parse in order to extract some specific lines. Here is a sample of the file's data:
dn: uid=portaladmin,ou=people,ou=myrealm,dc=portalDomain
objectclass: wlsUser objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: portaladmin
sn: portaladmin
description: Admin for portal domain
uid: portaladmin userpassword:: e3NzaGF9L3JYUldtVERBUklCdWM3NGtBSlJQVFVjQ04yRmNkU3o=
wlsMemberOf: cn=PortalSystemAdministrators,ou=groups,ou=myrealm,dc=portalDomain
dn: uid=weblogic,ou=people,ou=myrealm,dc=portalDomain
objectclass: wlsUser
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: weblogic
sn: weblogic
description: This user is the default administrator.
uid: weblogic
userpassword:: e3NzaGF9VHhObDZhTlBpZTFSa2VVeTRTak1vWm0yTFJmdlN4RE8=
wlsMemberOf: cn=Administrators,ou=groups,ou=myrealm,dc=portalDomain
wlsMemberOf: cn=PortalSystemAdministrators,ou=groups,ou=myrealm,dc=portalDomain
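As you can see, the information comes in blocks. I need to extract the values of the lines that start with cn:, sn:, description:, uid: and userpassword:, and I also need to tell the script to search the list for a specific uid or cn.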
I am not an experienced programmer, which is why I am asking the experts here. Thanks in advance for any help.
Answer 0: (score: 1)
Just use str.startswith to find those lines, passing it a tuple of prefixes:
with open("in.txt") as f:
    for line in f:
        if line.startswith(("cn:", "sn:", "description:", "uid:", "userpassword:")):
            print(line.rstrip())
Output:
cn: portaladmin
sn: portaladmin
description: Admin for portal domain
uid: portaladmin userpassword:: e3NzaGF9L3JYUldtVERBUklCdWM3NGtBSlJQVFVjQ04yRmNkU3o=
cn: weblogic
sn: weblogic
description: This user is the default administrator.
uid: weblogic
userpassword:: e3NzaGF9VHhObDZhTlBpZTFSa2VVeTRTak1vWm0yTFJmdlN4RE8=
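The question asks for the values rather than the whole lines, so here is a minimal sketch of one way to separate each attribute name from its value with str.partition. The names WANTED and extract_values are made up for this illustration, and the sketch assumes one attribute per line:

```python
WANTED = ("cn", "sn", "description", "uid", "userpassword")

def extract_values(lines):
    """Yield (attribute, value) pairs for the wanted attributes."""
    for line in lines:
        # Split on the first colon only; the value may contain colons itself.
        key, sep, value = line.partition(":")
        if sep and key in WANTED:
            # "userpassword::" leaves a second colon at the start of the value.
            yield key, value.lstrip(":").strip()

sample = [
    "cn: weblogic\n",
    "description: This user is the default administrator.\n",
    "userpassword:: e3NzaGF9VHhObDZhTlBpZTFSa2VVeTRTak1vWm0yTFJmdlN4RE8=\n",
    "objectclass: top\n",
]
for key, value in extract_values(sample):
    print(key, "=>", value)
```

With a real file, the open file object can be passed in place of the sample list, since iterating over it yields lines in the same way.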
Based on your comment, if you want to search for substrings instead, you can use any:
if any(sub in line for sub in ("cn: somestring", "sn: somestring", "description: somestring", "uid: somestring", "userpassword:: somestring")):
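If the pattern is more complex you may need a regular expression, but without knowing exactly what you want to extract it is not possible to suggest a regex that would work.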
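As a rough illustration of what such a regex could look like, the sketch below assumes the goal is to find the uid: or cn: lines for one particular name. The helper make_matcher is a hypothetical name, not something from the question:

```python
import re

def make_matcher(name):
    """Build a regex for 'uid:' or 'cn:' lines whose value begins with `name` as a whole token."""
    # re.escape guards against regex metacharacters in the name;
    # (?!\S) requires whitespace or end of line right after the name.
    return re.compile(r"^(uid|cn):\s*" + re.escape(name) + r"(?!\S)")

lines = [
    "cn: portaladmin",
    "uid: portaladmin",
    "cn: weblogic",
    "uid: weblogic",
    "wlsMemberOf: cn=Administrators,ou=groups,ou=myrealm,dc=portalDomain",
]

matcher = make_matcher("weblogic")
hits = [line for line in lines if matcher.match(line)]
print(hits)
```

Anchoring the pattern with ^ keeps it from matching the cn= inside a wlsMemberOf: line, and the lookahead keeps a search for "portal" from matching "portaladmin".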
Answer 1: (score: -1)
extractedLines = []
with open("file.txt", "r") as f:
    for line in f:
        for item in ["cn:", "sn:", "description:", "uid:", "userpassword:"]:
            if item in line:
                extractedLines.append(line)
                break  # stop after the first match so a line is not added twice
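Because the data comes in blocks that each begin with a dn: line, looking up a specific uid or cn may be easier if the lines are grouped into entries first. The following is only a sketch under some assumptions: every attribute sits on its own line, no values are folded across lines, and parse_entries and find_entries are hypothetical helper names.

```python
from collections import defaultdict

def parse_entries(lines):
    """Group lines into entries; each 'dn:' line starts a new entry."""
    entries = []
    current = None
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.lstrip(":").strip()  # also handles "userpassword::"
        if key == "dn":
            current = defaultdict(list)
            entries.append(current)
        if current is not None:
            # Attributes such as objectclass can repeat, so keep a list.
            current[key].append(value)
    return entries

def find_entries(entries, attr, wanted):
    """Return the entries whose `attr` (e.g. 'uid' or 'cn') includes `wanted`."""
    return [entry for entry in entries if wanted in entry.get(attr, [])]

sample = """\
dn: uid=portaladmin,ou=people,ou=myrealm,dc=portalDomain
cn: portaladmin
uid: portaladmin
dn: uid=weblogic,ou=people,ou=myrealm,dc=portalDomain
cn: weblogic
description: This user is the default administrator.
uid: weblogic
""".splitlines()

entries = parse_entries(sample)
for entry in find_entries(entries, "uid", "weblogic"):
    for attr in ("cn", "sn", "description", "uid", "userpassword"):
        for value in entry.get(attr, []):
            print(f"{attr}: {value}")
```

Storing each attribute as a list keeps repeated attributes such as objectclass and wlsMemberOf intact, and the same find_entries call works for a cn lookup by passing "cn" as the attribute.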