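I wrote a Python program that prints each line of a GEDCOM file together with its level number and tag (GEDCOM is essentially a file format for genealogy data).
Each line of a GEDCOM file has the following structure: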
<level-number> <tag> <arguments>
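I don't want to print every tag, only the specific tags I have added to the key_words list; for all other tags I want to print "invalid tag". The problem is that "invalid tag" is printed every time, even when a matching tag is found and printed. Basically, the else branch is executed every time.
How can I fix this? And how can I handle the tag 'INDI', since it is not being printed?
Here is my code: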
key_words = ['INDI','NAME','SEX','BIRT','DEAT','FAMC','FAMS','FAM','MARR','HUSB','WIFE','CHIL','DIV','DATE','HEAD','TRLR','NOTE']
#opening file
text_file = open('C:\Users\shree\Canopy\My-Family-18-May-2016-582.ged', 'r')
print "Printing each line of gedcom file followed by level no and tag line"
for line in text_file:
    print "line is:-", line
    level_number = int(line[:1])
    print "Level number is",level_number
    line = line.split()
    for word in key_words:
        if word in line:
            print "Tage is:-",word,"\n"
        else:
            print "invalid tag"
Sample lines:
0 HEAD
1 SOUR Family Echo
2 WWW http://www.familyecho.com/
1 FILE My Family
1 DATE 18 MAY 2016
1 DEST ANSTFILE
1 GEDC
2 VERS 5.5.1
2 FORM LINEAGE-LINKED
1 SUBM @I1@
2 NAME Nico Rosberg
1 SUBN
1 CHAR UTF-8
0 @I1@ INDI
1 NAME Nico /Rosberg/
2 GIVN Nico
2 SURN Rosberg
2 _MARNM Rosberg
1 SEX M
1 BIRT
2 DATE 21 MAR 1989
1 FAMC @F1@
0 @I2@ INDI
1 NAME Tom /Rosberg/
2 GIVN Tom
2 SURN Rosberg
2 _MARNM Rosberg
1 SEX M
1 BIRT
2 DATE 15 MAR 1958
1 FAMS @F1@
1 FAMC @F2@
0 @I3@ INDI
1 NAME Laisly /Vettle/
2 GIVN Laisly
2 SURN Vettle
2 _MARNM Rosberg
1 SEX F
1 BIRT
2 DATE 15 SEP 1958
1 FAMS @F1@
1 FAMC @F3@
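Answer 0 (score: 0)
Your inner loop compares the line against every keyword, so the else branch runs once for each keyword that does not match. Extract the line's tag once and test it against the list instead. It looks like what you want is this: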
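line_words = line.split()
# the tag is normally the second field, right after the level number
line_tag = line_words[1].strip()
# records such as "0 @I1@ INDI" put a cross-reference ID second,
# so the tag is the third field there
if line_tag.startswith('@') and len(line_words) > 2:
    line_tag = line_words[2]
# check whether the tag is one of the keywords
if line_tag in key_words:
    print "Tag is:-",line_tag,"\n"
else:
    print "invalid tag"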
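
For reference, here is a minimal self-contained sketch of the same idea that can be run without the .ged file. The names KEY_WORDS, parse_line and describe are made up for this example and are not part of the question's code. It assumes that a field starting with '@' in second position is a cross-reference ID (as in "0 @I1@ INDI"), and it uses single-argument print(...) calls so that it runs under both Python 2 and Python 3.

```python
# Tags to recognise, copied from the question's key_words list.
# A set gives fast membership tests; a list would work as well.
KEY_WORDS = {'INDI', 'NAME', 'SEX', 'BIRT', 'DEAT', 'FAMC', 'FAMS', 'FAM',
             'MARR', 'HUSB', 'WIFE', 'CHIL', 'DIV', 'DATE', 'HEAD', 'TRLR',
             'NOTE'}


def parse_line(line):
    """Return (level, tag) for one GEDCOM line, or None if it is too short."""
    fields = line.split()
    if len(fields) < 2:
        return None
    # Convert the whole first field, not just the first character,
    # so multi-digit level numbers are handled too.
    level = int(fields[0])
    # Assumption: "0 @I1@ INDI" style records carry a cross-reference ID
    # before the tag, so the tag is the third field in that case.
    if len(fields) > 2 and fields[1].startswith('@'):
        tag = fields[2]
    else:
        tag = fields[1]
    return level, tag


def describe(line):
    """Build the message for one line; exactly one message per line."""
    parsed = parse_line(line)
    if parsed is None:
        return None
    level, tag = parsed
    if tag in KEY_WORDS:
        return "Level %d, tag %s" % (level, tag)
    return "Level %d, invalid tag" % level


# A few of the sample lines from the question.
sample = ["0 HEAD", "1 SOUR Family Echo", "0 @I1@ INDI",
          "1 SUBM @I1@", "2 DATE 21 MAR 1989"]
for sample_line in sample:
    print(describe(sample_line))
```

To process the real file, loop over the open file object instead of the sample list and skip lines for which describe returns None.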