我有一个文本文件,我需要根据最后的匹配条件读取行。例如,在最后一次出现特定word
或string
之后读取所有行直到文件末尾。
示例文件:
2016 Jun 01 13:48:46:590 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300006 Engine COMPLEXITY_CALCULATOR-GenerateComplexitySheet terminating
2016 Jun 01 13:50:47:692 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300001 Process Engine version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:702 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300009 BW Plugins: version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:48:46:590 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300006 Engine COMPLEXITY_CALCULATOR-GenerateComplexitySheet terminating
2016 Jun 01 13:50:47:692 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300001 Process Engine version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:702 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300009 BW Plugins: version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:710 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300011 Java version: Java HotSpot(TM) 64-Bit Server VM 23.3-b01
2016 Jun 01 13:50:47:711 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300012 OS version: amd64 Linux 2.6.32-573.3.1.el6.x86_64
2016 Jun 01 13:50:51:776 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300002 Engine COMPLEXITY_CALCULATOR-GenerateComplexitySheet started
从上面的文件中我想读取字符串last occurrence
COMPLEXITY_CALCULATOR-GenerateComplexitySheet terminating
之后的所有行
预期产出:
2016 Jun 01 13:50:47:692 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300001 Process Engine version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:702 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300009 BW Plugins: version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:710 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300011 Java version: Java HotSpot(TM) 64-Bit Server VM 23.3-b01
2016 Jun 01 13:50:47:711 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300012 OS version: amd64 Linux 2.6.32-573.3.1.el6.x86_64
2016 Jun 01 13:50:51:776 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300002 Engine COMPLEXITY_CALCULATOR-GenerateComplexitySheet started
答案 0 :(得分:2)
由于你在groovy标签上发布了这个,我希望你可以从shell中使用groovy。我编写了以下脚本。虽然它遍历文件两次,但它将以流式方式工作,不会炸毁你的记忆:
f = new File("sample.txt")
def lastIndex
f.eachLine { line, index ->
if (line.contains("GenerateComplexitySheet terminating")) {
lastIndex = index + 1
}
}
new File("out.txt").with {
write ""
withWriter { writer ->
f.eachLine { line, index ->
if (index >= lastIndex) {
writer.writeLine line
}
}
}
assert text == '''2016 Jun 01 13:50:47:692 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300001 Process Engine version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:702 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300009 BW Plugins: version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:710 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300011 Java version: Java HotSpot(TM) 64-Bit Server VM 23.3-b01
2016 Jun 01 13:50:47:711 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300012 OS version: amd64 Linux 2.6.32-573.3.1.el6.x86_64
2016 Jun 01 13:50:51:776 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300002 Engine COMPLEXITY_CALCULATOR-GenerateComplexitySheet started
'''
}
答案 1 :(得分:2)
试试这个: -
def file = new File("file.txt")
def index = file.findLastIndexOf {it =~ "COMPLEXITY_CALCULATOR-GenerateComplexitySheet terminating" }
def lines = file.readLines()
lines[(index+1)..(lines.size()-1)].each { println it }
输出: -
2016 Jun 01 13:50:47:692 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300001 Process Engine version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:702 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300009 BW Plugins: version 5.11.0, build V62_hotfix017, 2015-9-24
2016 Jun 01 13:50:47:710 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300011 Java version: Java HotSpot(TM) 64-Bit Server VM 23.3-b01
2016 Jun 01 13:50:47:711 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300012 OS version: amd64 Linux 2.6.32-573.3.1.el6.x86_64
2016 Jun 01 13:50:51:776 GMT +0200 BW.COMPLEXITY_CALCULATOR-GenerateComplexitySheet Info [BW-Core] BWENGINE-300002 Engine COM PLEXITY_CALCULATOR-GenerateComplexitySheet started
希望它会对你有所帮助.. :)
答案 2 :(得分:2)
无论语言如何,有两种算法可以实现这一目标:
首先:
initialise a temporary store (memory or temp file)
open input
while(read line) {
if(line matches search pattern) {
clear temp store
}
write line to temp store
}
copy temp store to output
第二
open input
while(read line) {
if(line matches search pattern) {
store line number in variable
}
}
close input
open input again
read until stored line number
read / write until end
第一个选项的优点是它适用于管道输入,您无法在开始时重新打开输入。但它的缺点是你必须将输出线存储在某个临时位置,直到你到达输入的最后一行。
第二个选项的优点是它一次只能在内存中保存一行输入。它的缺点是它永远无法使用它从一开始就可以重新打开的输入源。
您应该能够在Groovy或shell中轻松实现其中任何一个。
在shell中,如果输入是文件,则可以拼凑第二种算法的版本:
tail --lines=+$(grep -n pattern input.txt | tail -1 | cut -d: -f1) input.txt
我们在此处使用grep -n
查找匹配的行(包含行号),tail -1
选择最后一行,cut
以提取行号,以及{ {1}}将这些行写入stdout。