Question

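I want to extract the second number from each line of a text file, but only if that line contains "command". I want the command and the rest of its text printed next to that number. There are hundreds of lines.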

The lines look like this:

1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

After processing, this line should come out as:

1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

How do I do this?

Answer 1

Use the csv module to handle CSV-like data (here delimited by pipes rather than commas):

import csv

with open('inputfile', 'rb') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print row[1], row[5]

csv.reader() gives you an iterator that yields a list for each row; your sample line would produce:

['1376328501.285', '1166703600', '0', 'SimControl', '4', 'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62']

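Indexes start at 0, so the column holding the Command text is row[5], and the number in the second column is row[1]. The code above tests whether the current row has enough columns and whether row[5], lowercased, starts with command.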

The above assumes Python 2; for Python 3 it looks slightly different:

import csv

with open('inputfile', newline='') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print(row[1], row[5])
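
To try this answer's approach without a file on disk, here is a minimal Python 3 sketch that feeds csv.reader() from an in-memory string. The name extract_commands and the second sample line ("Status OK") are made up for illustration; only the first sample line comes from the question.

```python
import csv
import io

# First line is the sample from the question; the second is a hypothetical
# non-command line added only to show the filter.
SAMPLE = (
    '1376328501.285|1166703600|0|SimControl|4|'
    'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62\n'
    '1376328502.100|1166703601|0|SimControl|4|Status OK\n'
)

def extract_commands(lines):
    """Yield (second_field, command_text) for rows whose sixth field starts with 'command'."""
    reader = csv.reader(lines, delimiter='|')
    for row in reader:
        # Skip short rows, then compare case-insensitively.
        if len(row) > 5 and row[5].lower().startswith('command'):
            yield row[1], row[5]

results = list(extract_commands(io.StringIO(SAMPLE, newline='')))
for number, command in results:
    print(number, command)
```

Any iterable of lines works in place of the StringIO object, including an open file. If a field could begin with a double quote, passing quoting=csv.QUOTE_NONE to csv.reader() is one way to keep the quotes from being interpreted.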

Answer 2

>>> l = """1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62"""
>>> l = [l,l,l]

>>> [ele.split("|")[1] for ele in l if "command" in ele.lower()]
['1166703600', '1166703600', '1166703600']
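
The comprehension above returns only the numbers. A possible variant that also keeps the command text, as the question asks, might look like the sketch below. It assumes the command is always the last field; the "Status OK" line is a hypothetical non-matching line added for illustration.

```python
line = ('1376328501.285|1166703600|0|SimControl|4|'
        'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62')
other = '1376328502.100|1166703601|0|SimControl|4|Status OK'  # hypothetical line
lines = [line, other, line]

# Split each line once, then keep "number command" for lines whose last
# field starts with "command" (case-insensitive).
pairs = [
    '{} {}'.format(fields[1], fields[-1])
    for fields in (entry.split('|') for entry in lines)
    if len(fields) > 1 and fields[-1].lower().startswith('command')
]

for pair in pairs:
    print(pair)
```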

Answer 3

lines = '1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62'

if 'Command' in lines:
    # The second field is the number; the last field is the command text.
    lines_lst = lines.split('|')
    what_you_want = lines_lst[1] + ' ' + lines_lst[-1]
    print(what_you_want)
    # Output: 1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

So, if you have a file containing thousands of lines like this:

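# YOUR_FILE is the path to your input file.
with open(YOUR_FILE, 'r') as f:
    data = f.readlines()

foo = []
for lines in data:
    if 'Command' in lines:
        # Strip the trailing newline so it does not end up in the result.
        lines_lst = lines.rstrip('\n').split('|')
        what_you_want = lines_lst[1] + ' ' + lines_lst[-1]
        foo.append(what_you_want)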
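
One caveat with taking the last element of split('|'): if the command text itself ever contains a pipe, only the tail of it is kept. A sketch that avoids this by capping the number of splits is shown below. It assumes every line has exactly five fields before the command text, as in the question's sample; the function name and the second and third sample lines are hypothetical.

```python
def iter_command_lines(lines):
    """Yield 'number command' strings from an iterable of pipe-delimited lines."""
    for line in lines:
        # maxsplit=5 keeps any later '|' characters inside the sixth field.
        fields = line.rstrip('\r\n').split('|', 5)
        if len(fields) == 6 and fields[5].startswith('Command'):
            yield fields[1] + ' ' + fields[5]

sample = [
    '1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62\n',
    '1376328502.100|1166703601|0|SimControl|4|Command A|B\n',  # hypothetical: pipe inside the text
    '1376328503.000|1166703602|0|SimControl|4|Status OK\n',    # hypothetical: not a command
]
out = list(iter_command_lines(sample))
for text in out:
    print(text)
```

Because the function is a generator, an open file object can be passed directly, so the whole file never has to be held in memory.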

Answer 4

import re

s = '''
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
'''
print(s)

# Group 1: the second field. Group 2: everything from "Command" to the end of the line.
regx = re.compile(r'^[^|]+\|([^|]+).+?(Command.+\n?)',
                  re.MULTILINE)

print(''.join('%s %s' % m.groups() for m in regx.finditer(s)))

Result

1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee

1166703600 Command aaaaa
11660 Command bbb
-2.87 Command eeee
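
For comparison, here is a sketch of the same idea applied one line at a time, which keeps the pattern from ever spanning a newline. The pattern and the name extract are my own variation, not the code from this answer, and it assumes the command text begins a field, meaning it directly follows a pipe.

```python
import re

# Group 1: the second field. Group 2: a later field that starts with "Command",
# through the end of the line.
LINE_RE = re.compile(r'[^|]*\|([^|]+)\|(?:.*?\|)?(Command.*)')

def extract(text):
    """Return 'number command' strings for the matching lines in text."""
    results = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            results.append('%s %s' % m.groups())
    return results

s = '''
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
'''

print('\n'.join(extract(s)))
```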

Python: extract the second number in a line if the line meets a condition
