I have a script that produces the following output (I also save this output to a file, f1 = 20141202.194812_carStatus /):
---------------------------------------------
TM 05120970.01: Processing...
TM 05120970: Processing...
TM 05120970: current status Open
TM 05120970: Owner_Info.User_ref = crossi14
TM 05120970: Owner_Info.Email = Criss.Rossi@gmail.com
TM 05120970: CarModel = Nissan Micra
----------------------------------------------
TM 05157414.06: Processing...
TM 05157414: Processing...
TM 05157414: current status Open
TM 05157414: Owner_Info.User_ref = yumiao12
TM 05157414: Owner_Info.Email = Yu.Miao@gmail.com
TM 05157414: CarModel = Toyota Avensis
----------------------------------------------
I have used: exec_cmd('cat ' + f1 + '| grep -e "CarModel = " -e "Owner_Info.User_ref = "')
but I also need the first line of each block (actually the second line):
TM 05157414.06: Processing...
What I am trying to do is parse each block and store its values in variables:
TM 05120970.01 -> car_number = 05120970.01
Owner_Info.User_ref = crossi14 -> owner_user = crossi14
CarModel = Nissan Micra -> car_model = Nissan Micra
With this information I will add some default values, such as:
priority = Unknown
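I then need to pass these variables as input to another script called insert_owner_car.pl:
insert_owner_car.pl -id 05120970.01 -o owner_user="crossi14",car_model="Nissan Micra",priority="Unknown"
This is what I have so far, but it is not usable because I cannot get the values listed above: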
#!/usr/bin/python
import itertools, commands, datetime, os, re, sys, time

inFile = open("/tmp/20141202.194812_carStatus")
outFile = open("result.txt", "w")
keepCurrentSet = False
for line in inFile:
    if line.startswith("----------------------------------------------"):
        keepCurrentSet = False
    if keepCurrentSet:
        parts = line.split(" = ")[1:]
        part=','.join(parts)
        print part
        #outFile.write(parts)
    if line.startswith("----------------------------------------------"):
        keepCurrentSet = True
inFile.close()
outFile.close()
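I don't know how to get the 05120970.01 part, or how to collect all the variables of one block so that I can use them as input for the other script.
PS: I have Python 2.5.1
Answer 0 (score: 0)
You can use the utility function open_chunk to process the file in chunks: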
import re
import subprocess

def open_chunk(readfunc, delimiter, chunksize=1024):
    """
    readfunc(chunksize) should return a string.
    """
    remainder = ''
    for chunk in iter(lambda: readfunc(chunksize), ''):
        # Split what has been read so far; the last piece may be incomplete,
        # so it is carried over and prepended to the next chunk.
        pieces = re.split(delimiter, remainder + chunk)
        for piece in pieces[:-1]:
            yield piece
        remainder = pieces[-1]
    if remainder:
        yield remainder

# Path taken from the question; adjust as needed.
filename = "/tmp/20141202.194812_carStatus"
f = open(filename, 'r')
for chunk in open_chunk(f.read, delimiter=r'-{45,}'):
    chunk = chunk.strip()
    if chunk:
        lines = chunk.splitlines()
        firstline = lines[0]
        # "TM 05120970.01: Processing..." -> "05120970.01" (drop the trailing colon)
        car_number = firstline.split()[1][:-1]
        for line in lines[1:]:
            if 'Owner_Info.User_ref = ' in line:
                owner_user = line.split(" = ")[1]
            elif 'CarModel = ' in line:
                car_model = line.split(" = ")[1]
        cmd = ['insert_owner_car.pl',
               '-id',
               car_number,
               '-o',
               'owner_user="%s"' % (owner_user, ),
               'car_model="%s"' % (car_model, ),
               'priority="Unknown"']
        print(' '.join(cmd))
        # subprocess.call(cmd)
f.close()
which prints
insert_owner_car.pl -id 05120970.01 -o owner_user="crossi14" car_model="Nissan Micra" priority="Unknown"
insert_owner_car.pl -id 05157414.06 -o owner_user="yumiao12" car_model="Toyota Avensis" priority="Unknown"
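Note that the output above passes the three key=value pairs as separate arguments, whereas the command line in the question joins them with commas into a single value after -o. A minimal sketch of building that form follows. The helper name build_command is made up for illustration, and it is an assumption that insert_owner_car.pl wants exactly the comma-separated layout shown in the question.

```python
def build_command(car_number, owner_user, car_model, priority="Unknown"):
    """Return an argument list for insert_owner_car.pl (hypothetical helper).

    Assumption: the script expects one comma-separated value after -o,
    as in the command line shown in the question.
    """
    options = ",".join([
        'owner_user="%s"' % owner_user,
        'car_model="%s"' % car_model,
        'priority="%s"' % priority,
    ])
    return ["insert_owner_car.pl", "-id", car_number, "-o", options]


cmd = build_command("05120970.01", "crossi14", "Nissan Micra")
print(" ".join(cmd))
```

When a list like this is handed to subprocess.call, no shell is involved, so the double quotes reach the Perl script literally. Whether the script wants them or expects the shell to strip them is not stated in the question, so check that before uncommenting the call.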
If your data file is small, you can instead read the whole file into a string and use re.split to split it into chunks:
In [37]: import re
In [38]: re.split(r'-{45,}', open('data').read())
Out[38]:
['\n\n',
'\nTM 05120970.01: Processing...\nTM 05120970: Processing...\nTM 05120970: current status Open\nTM 05120970: Owner_Info.User_ref = crossi14\nTM 05120970: Owner_Info.Email = Criss.Rossi@gmail.com\nTM 05120970: CarModel = Nissan Micra\n',
'\nTM 05157414.06: Processing...\nTM 05157414: Processing...\nTM 05157414: current status Open\nTM 05157414: Owner_Info.User_ref = yumiao12\nTM 05157414: Owner_Info.Email = Yu.Miao@gmail.com\nTM 05157414: CarModel = Toyota Avensis\n',
'\n']
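This can be used in place of open_chunk above. The advantage of open_chunk is that it also works on very large files, where reading the entire file into a string and splitting it into a list would take too much memory.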
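For the small-file case, the pieces returned by re.split can be turned into one record per block. The following is only a sketch: parse_blocks and SAMPLE are names invented here, and the sample text is a shortened copy of the output in the question. It avoids newer syntax so it should also run on an old interpreter such as the asker's Python 2.5.1.

```python
import re

SEP = "-" * 46  # separator line, as in the question's output
SAMPLE = "\n".join([
    SEP,
    "TM 05120970.01: Processing...",
    "TM 05120970: Processing...",
    "TM 05120970: current status Open",
    "TM 05120970: Owner_Info.User_ref = crossi14",
    "TM 05120970: CarModel = Nissan Micra",
    SEP,
    "TM 05157414.06: Processing...",
    "TM 05157414: Owner_Info.User_ref = yumiao12",
    "TM 05157414: CarModel = Toyota Avensis",
    SEP,
    "",
])


def parse_blocks(text):
    """Split text on separator lines and return one dict per block."""
    records = []
    for block in re.split(r'-{45,}', text):
        lines = block.strip().splitlines()
        if not lines:
            continue  # empty piece before the first or after the last separator
        # "TM 05120970.01: Processing..." -> "05120970.01"
        record = {'car_number': lines[0].split()[1].rstrip(':')}
        for line in lines[1:]:
            if ' = ' not in line:
                continue
            key, value = line.split(' = ', 1)
            if key.endswith('Owner_Info.User_ref'):
                record['owner_user'] = value
            elif key.endswith('CarModel'):
                record['car_model'] = value
        records.append(record)
    return records


for record in parse_blocks(SAMPLE):
    print("%s %s %s" % (record['car_number'], record['owner_user'], record['car_model']))
```

Keeping each block in its own dict means a value from one block cannot leak into the next one if a field happens to be missing.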
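Answer 1 (score: 0)
You should use the re module to extract the relevant information: it is standard, simple and robust.
You can then emit the information for a block each time you reach a block delimiter, with one extra check at the end of the file to catch a last block that is not followed by a delimiter.
The script would be: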
import re

# Raw strings keep the backslashes intact for the regex engine.
rnum = re.compile(r'\s*TM\s+([^\s:]+):.*')
ruser = re.compile(r'.*Owner_Info\.User_ref\s*=\s*(.*)')
rmodel = re.compile(r'.*CarModel\s*=\s*(.*)')

def display(out, num, user, model):
    # %-formatting prints the same text on Python 2 and Python 3
    print("%s %s %s" % (num, user, model))
    out.write('insert_owner_car.pl -id %s -o owner_user="%s",car_model="%s",priority="Unknown"\n' % (num, user, model))

inFile = open("/tmp/20141202.194812_carStatus")
outFile = open("result.txt", "w")
firstOfBlock = False
carnum = None
user = None   # defined up front so a block without these lines does not raise NameError
model = None
for line in inFile:
    if line.startswith("--------------------------------"):
        # A delimiter closes the previous block (if any) and opens a new one.
        firstOfBlock = True
        if carnum is not None:
            display(outFile, carnum, user, model)
            carnum = None
    else:
        if firstOfBlock:
            m = rnum.match(line)
            if m is not None:
                carnum = m.group(1)
                firstOfBlock = False
        else:
            line = line.strip()
            m = ruser.match(line)
            if m is not None:
                user = m.group(1)
            else:
                m = rmodel.match(line)
                if m is not None:
                    model = m.group(1)
# Catch a last block that is not followed by a delimiter.
if carnum is not None:
    display(outFile, carnum, user, model)
    carnum = None
inFile.close()
outFile.close()
With the current example, the output is
05120970.01 crossi14 Nissan Micra
05157414.06 yumiao12 Toyota Avensis
and result.txt contains:
insert_owner_car.pl -id 05120970.01 -o owner_user="crossi14",car_model="Nissan Micra",priority="Unknown"
insert_owner_car.pl -id 05157414.06 -o owner_user="yumiao12",car_model="Toyota Avensis",priority="Unknown"
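One thing to watch with a line-by-line state machine like this is a block that lacks a User_ref or CarModel line: unless the values are reset per block, the previous block's value would be reused. A possible variation, sketched here with made-up names (iter_records, RNUM, RUSER, RMODEL), yields one tuple per block and fills missing fields with None. Unlike the script above, it also accepts a TM line that appears before the first delimiter.

```python
import re

RNUM = re.compile(r'\s*TM\s+([^\s:]+):')
RUSER = re.compile(r'.*Owner_Info\.User_ref\s*=\s*(.*)')
RMODEL = re.compile(r'.*CarModel\s*=\s*(.*)')


def iter_records(lines):
    """Yield (car_number, owner_user, car_model) for each block of lines."""
    record = None  # None means "waiting for the first line of a block"
    for line in lines:
        line = line.strip()
        if line.startswith('-' * 32):
            # Delimiter: emit the block collected so far, then start over.
            if record is not None:
                yield tuple(record)
            record = None
        elif record is None:
            m = RNUM.match(line)
            if m is not None:
                record = [m.group(1), None, None]
        else:
            m = RUSER.match(line)
            if m is not None:
                record[1] = m.group(1)
                continue
            m = RMODEL.match(line)
            if m is not None:
                record[2] = m.group(1)
    # Last block when the input does not end with a delimiter.
    if record is not None:
        yield tuple(record)


sample = [
    '-' * 46,
    'TM 05120970.01: Processing...',
    'TM 05120970: Processing...',
    'TM 05120970: Owner_Info.User_ref = crossi14',
    'TM 05120970: CarModel = Nissan Micra',
    '-' * 46,
    'TM 05157414.06: Processing...',
    'TM 05157414: Owner_Info.User_ref = yumiao12',
]
for num, user, model in iter_records(sample):
    print("%s %s %s" % (num, user, model))
```

Since the function accepts any iterable of lines, an open file object can be passed in directly in place of the sample list.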