Python: extract the second number from a line if the line meets a condition

Date: 2013-10-30 17:44:46

Tags: python text numbers extract

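I want to extract the second number from each line of a text document, but only if the line contains "command". I want the command and the rest of the line printed next to that number. There are hundreds of lines.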

The lines look like this:

1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

After processing, this line should come out as:

1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

How should I do this?

4 answers:

Answer 0 (score: 3)

Use the csv module to handle CSV-like data such as this (even though it is pipe-delimited):

import csv

with open('inputfile', 'rb') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print row[1], row[5]

csv.reader() gives you an iterator that yields a list for each row; your sample line would result in:

['1376328501.285', '1166703600', '0', 'SimControl', '4', 'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62']

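Indexes start at 0, so the Command text is in row[5] and the number from the second column is in row[1]. The code above checks that the current row has enough columns and that row[5], lowercased, starts with command.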

The above assumes Python 2; for Python 3 it looks slightly different:

import csv

with open('inputfile', newline='') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print(row[1], row[5])
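
One way to check the column indexes described in this answer without touching a file is to feed the sample line to csv.reader through io.StringIO. This is only a minimal Python 3 sketch: the helper name extract_commands, the sample variable, and the second "Status OK" line are made up for illustration and are not part of the original answer.

```python
import csv
import io

# Sample data: the line from the question plus one hypothetical non-command line.
sample = (
    '1376328501.285|1166703600|0|SimControl|4|'
    'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62\n'
    '1376328501.300|1166703601|0|SimControl|4|Status OK\n'
)


def extract_commands(lines):
    """Yield (second column, sixth column) for rows whose sixth column
    starts with 'command', compared case-insensitively."""
    for row in csv.reader(lines, delimiter='|'):
        if len(row) > 5 and row[5].lower().startswith('command'):
            yield row[1], row[5]


# csv.reader accepts any iterable of strings, so a StringIO works like a file.
results = list(extract_commands(io.StringIO(sample, newline='')))
for number, command in results:
    print(number, command)
```

Because the helper takes any iterable of lines, an open file object could be passed in place of the StringIO.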

Answer 1 (score: 0)

>>> l = """1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62"""
>>> l = [l,l,l]

>>> [ele.split("|")[1] for ele in l if "command" in ele.lower()]
['1166703600', '1166703600', '1166703600']
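
The list comprehension above returns only the number. A possible extension, sketched here in Python 3, also keeps the command text. It splits at most five times so that any "|" inside the message would stay in the last field. The function name number_and_command and the second test line are hypothetical, and the sketch assumes the command text is always the sixth field, as in the sample line.

```python
def number_and_command(lines):
    """Return 'number command-text' strings for lines that mention 'command'."""
    out = []
    for line in lines:
        if "command" not in line.lower():
            continue
        # Split at most 5 times: this gives at most 6 fields.
        fields = line.rstrip("\n").split("|", 5)
        if len(fields) == 6:  # skip malformed lines with too few fields
            out.append(fields[1] + " " + fields[5])
    return out


line = ('1376328501.285|1166703600|0|SimControl|4|'
        'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62')
print(number_and_command([line, "1|2|0|x|4|other"]))
```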

Answer 2 (score: 0)

lines = '1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62'

if 'Command' in lines:
    lines_lst = lines.split('|')
    what_you_want = lines_lst[1] + ' '+ lines_lst[-1]

print what_you_want
# prints: 1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

So, if you have a file containing thousands of lines like this:

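# YOUR_FILE is a placeholder for the path to your input file
with open(YOUR_FILE, 'r') as f:
    data = f.readlines()

foo = []
for lines in data:
    if 'Command' in lines:
        # drop the trailing newline before splitting on the pipe
        lines_lst = lines.rstrip('\n').split('|')
        # second field, then the last field (the command text)
        what_you_want = lines_lst[1] + ' ' + lines_lst[-1]
        foo.append(what_you_want)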

Answer 3 (score: 0)

import re

s = '''
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
'''
print s

regx = re.compile('^[^|]+\|([^|]+).+?(Command.+\n?)',
                  re.MULTILINE)

print ''.join('%s %s' % m.groups() for m in regx.finditer(s))

Result

1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee

1166703600 Command aaaaa
11660 Command bbb
-2.87 Command eeee
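
The code in this answer is written for Python 2. A rough Python 3 variant of the same regular-expression idea is sketched below. It is not the answerer's exact pattern: it uses a raw string, anchors the match at the end of the line with $ instead of capturing the newline, and joins the results with '\n'. The names pattern and matches are illustrative.

```python
import re

# Group 1: the second pipe-separated field.
# Group 2: everything from the first "Command" to the end of the line.
pattern = re.compile(r'^[^|]+\|([^|]+)\|.*?(Command.*)$', re.MULTILINE)

s = """1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
"""

# Lines without "Command" produce no match and are skipped.
matches = ['%s %s' % m.groups() for m in pattern.finditer(s)]
print('\n'.join(matches))
```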