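I want to extract the second number from each line of a text document, but only if the line contains "command". I want the command and the rest of the line printed next to that number. There are hundreds of lines.
The lines look like this: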
1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62
For this line, the program should output:
1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62
How do I do that?
Answer 0 (score: 3)
Use the csv module to handle CSV-like data (even though this data is pipe-delimited):
import csv
with open('inputfile', 'rb') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print row[1], row[5]
csv.reader() gives you an iterator that yields a list for each row; your sample line results in:
['1376328501.285', '1166703600', '0', 'SimControl', '4', 'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62']
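Indexes start at 0, so the column holding the Command text is row[5], and the number in the second column is row[1]. The code above tests whether the current row has enough columns and whether row[5], once lowercased, starts with command.
The above assumes Python 2; for Python 3 it looks slightly different: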
import csv
with open('inputfile', newline='') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print(row[1], row[5])
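To try this without a file on disk, the same logic can be wrapped in a function and fed an in-memory sample. This is a minimal Python 3 sketch, not part of the original answer; the function name extract_commands and the second sample row are made up for illustration.

```python
import csv
import io


def extract_commands(lines):
    """Return 'second_field command_text' strings for Command rows.

    `lines` is any iterable of text lines (an open file works too).
    """
    results = []
    for row in csv.reader(lines, delimiter='|'):
        # Skip rows that are too short, then check the sixth field.
        if len(row) > 5 and row[5].lower().startswith('command'):
            results.append('{} {}'.format(row[1], row[5]))
    return results


# Sample data: the line from the question plus a hypothetical non-matching row.
sample = io.StringIO(
    '1376328501.285|1166703600|0|SimControl|4|'
    'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62\n'
    '13587918501.1|13|0|XCF|6|cccccc\n'
)
for text in extract_commands(sample):
    print(text)
```

Returning a list instead of printing inside the loop keeps the filtering separate from the output, so the results can also be written to another file.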
Answer 1 (score: 0)
>>> l = """1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62"""
>>> l = [l,l,l]
>>> [ele.split("|")[1] for ele in l if "command" in ele.lower()]
['1166703600', '1166703600', '1166703600']
Answer 2 (score: 0)
lines = '1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62'
if 'Command' in lines:
    lines_lst = lines.split('|')
    what_you_want = lines_lst[1] + ' ' + lines_lst[-1]
    print what_you_want
>>> 1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62
So, if you have a file with thousands of lines like this:
f = open(YOUR_FILE, 'r')
data = f.readlines()
f.close()
foo = []
for lines in data:
    if 'Command' in lines:
        lines_lst = lines.split('|')
        what_you_want = lines_lst[1] + ' ' + lines_lst[-1]
        foo.append(what_you_want)
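Reading everything with readlines() is fine for a few thousand lines, but each stored string keeps its trailing newline. A hedged Python 3 alternative iterates over the file lazily and strips the newline before splitting. The helper name iter_commands and its path argument are illustrative assumptions, not part of the answer.

```python
def iter_commands(path):
    """Yield 'second_field last_field' for each line containing 'Command'."""
    with open(path, 'r') as handle:
        for line in handle:  # the file object yields one line at a time
            if 'Command' not in line:
                continue
            fields = line.rstrip('\n').split('|')
            if len(fields) > 1:  # guard against lines with no pipe at all
                yield fields[1] + ' ' + fields[-1]
```

It could be used as `for text in iter_commands('inputfile'): print(text)`, where 'inputfile' stands for whatever path holds the log.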
Answer 3 (score: 0)
import re
s = '''
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
'''
print s
regx = re.compile(r'^[^|]+\|([^|]+).+?(Command.+\n?)',
                  re.MULTILINE)
print ''.join('%s %s' % m.groups() for m in regx.finditer(s))
Result
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
1166703600 Command aaaaa
11660 Command bbb
-2.87 Command eeee
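For Python 3, the same idea can be sketched with a raw-string pattern and print() as a function. The pattern below is an adaptation rather than the answer's exact regex: it keeps every match on a single line, requires the Command text to begin a field, and leaves the newline out of the capture. The names PATTERN and extract_with_regex are hypothetical.

```python
import re

# Field 1, then capture field 2, then skip whole fields until one
# that starts with "Command"; capture that field to the end of the line.
PATTERN = re.compile(
    r'^[^|\n]+\|([^|\n]+)\|(?:[^|\n]*\|)*?(Command.*)$',
    re.MULTILINE,
)


def extract_with_regex(text):
    """Return 'second_field command_text' for each matching line."""
    return ['%s %s' % m.groups() for m in PATTERN.finditer(text)]


s = '''
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
'''
print('\n'.join(extract_with_regex(s)))
```

Excluding `\n` from the character classes is the main design choice here: it stops a match from running over a line boundary when a line has fewer fields than expected.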