Question

I am trying to parse a few numbers and names out of a text file. The file looks similar to this:

1. 00054 **/ 063076600 NAME** Days Off: 21 Cr:021:00 
2. CAPS ALLL +++ VSVS
3. more lines of text
4. 00054 / 063076600 NAME Days Off: 21 Cr:074:30 
5. CAPS ALLL +++ VSVS
6. more lines of text....

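I need to parse the number after the "/" (063076600) and the name (NAME). The lines that contain these fields always contain a "/". NAME is also in all caps. I tried using str.isupper() for the name field, but a lot of text I don't need is also in all caps, as on line 2, so that doesn't work.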

Is there a way to specify that I want the 2 items after the "/" and add them to a list?

fname =raw_input('Enter the filename:')
listOfnames = []
try:
   fhand = open(fname)
except:
   print 'File cannot be opened',fname
   exit()
count = 0
with open(fname) as f:
   for line in f:
       # break line to words
       for word in line.strip().split():
           if word.startswith('/'):
               # get the number after "/" and append
               # get the NAME and append
               count = count + 1
               listOfnames.append(word)
               try:
                   print "number is", number
                   print "name is" , name
               except:
                   print "not available"
print listOfnames
print 'count is',count

Answer 1

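def handle_input(fname):
    listofnames = []
    count = 0
    try:
        with open(fname, 'r') as data:
            for line in data:
                # Only lines containing "/" carry the number and name.
                if '/' in line:
                    # split() with no argument drops empty strings, so the first
                    # two tokens after the "/" are the number and the name.
                    number, name = line.split('/')[1].split()[:2]
                    count += 1
                    listofnames.append([name, number])
        if listofnames:
            rows = ''.join('\t{}: {}\n'.format(name, number) for name, number in listofnames)
            print('count:{}\nnames:\n{}'.format(count, rows))
    except Exception:
        # Covers a missing file as well as a "/" line with fewer than two fields.
        print('not available')

handle_input('test.txt')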

Result:

count:2
names:
    NAME: 063076600
    NAME: 063076600
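
A possible alternative, in case splitting on spaces turns out to be too loose: a regular expression can require that the "/" is followed by a run of digits and then an all-caps word. This is only a sketch, and it assumes the number is always digits only and the name is a single uppercase word with no spaces. The names FIELD_RE and parse_fields are made up for this example.

```python
import re

# Assumed layout: "/", optional spaces, digits, spaces, an all-caps word.
FIELD_RE = re.compile(r'/\s*(\d+)\s+([A-Z]+)')

def parse_fields(lines):
    """Return a list of (number, name) tuples, one per matching line."""
    results = []
    for line in lines:
        match = FIELD_RE.search(line)
        if match:
            # groups() yields the two captured fields: (number, name)
            results.append(match.groups())
    return results

sample = [
    "00054 / 063076600 NAME Days Off: 21 Cr:021:00",
    "CAPS ALLL +++ VSVS",
    "more lines of text",
    "00054 / 063076600 NAME Days Off: 21 Cr:074:30",
]
print(parse_fields(sample))
# [('063076600', 'NAME'), ('063076600', 'NAME')]
```

Because the pattern is anchored on the "/", the all-caps text on line 2 is skipped without needing str.isupper(). To read from a file instead of a list, an open file object can be passed in place of sample, since it also iterates line by line.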

How to extract and parse string fields in a string text file
