How to process each result, rather than each line, of the grep (-oz) command (prior to version 2.25)

Time: 2017-02-06 10:26:31

Tags: regex linux bash shell grep

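From version 2.25 on, the output of grep -oz is fixed so that it uses null bytes instead of newlines to terminate output lines. This makes it simple to capture and process multiline grep matches (see the example).

Unfortunately, I am stuck with grep version 2.20 in production. This means that when processing \n-terminated log files, you cannot tell the individual grep matches apart in the output, because every output line ends in a newline, whether or not it ends a match.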
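To make the difference concrete, here is a minimal sketch. It does not call grep at all: the two hypothetical helper functions `emit_nul` and `emit_newline` use printf to simulate the two output styles, so the sample records are assumptions, not real grep output.

```shell
#!/bin/bash
# Simulated output: two matches, the second one spanning two lines.
# printf stands in for grep so the sketch does not depend on the grep version.
emit_nul()     { printf 'flag a\0flag b\n    more\0'; }   # NUL after each match
emit_newline() { printf 'flag a\nflag b\n    more\n'; }   # newline after each line

# Reading with -d '' splits on NUL bytes, so each match stays in one piece.
nul_records=()
while IFS= read -r -d '' rec; do
    nul_records+=("$rec")
done < <(emit_nul)

# Reading line by line cannot tell where a multiline match ends.
line_records=()
while IFS= read -r rec; do
    line_records+=("$rec")
done < <(emit_newline)

echo "NUL-delimited records:     ${#nul_records[@]}"
echo "newline-delimited records: ${#line_records[@]}"
```

With NUL delimiters the two-line match arrives as a single record; with newline delimiters it is split in two.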

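Hence my question:

What is the most efficient way to process each result, rather than each line, of the grep (-oz) command when you are stuck with a version before 2.25?

(Note: this is a small example from a more complex script that has to process large log files (over 10k) on each request, which is why I am looking for the "most efficient" solution.)

A simple example: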

test.log

flag test1
flag test2
flag test3
    test4
    test5
flag test6

test7

flag test8

test.sh

#!/bin/bash
#regex explained: 
#(?s)enable multiline pattern search
#(flag) capturegroup with pattern indicating new entry
#[[:blank:]] followed by a space or a tab
#(.*?) capturegroup for the rest of the entry, non-greedy
#(?=(?:\r\n|[\r\n])(flag)|\z) positive lookahead: 
# - stop when the next newline begins with flag 
# - OR if last entry is a match: proceed 'till end of entry

regex_multiline="(?s)(flag)[[:blank:]](.*?)(?=(?:\r\n|[\r\n])(flag)|\z)"
logfile="./test.log"

test1(){
    #this works only with grep 2.25 or higher, 
    #which returns a NULL-byte delimiter after each capture
    echo start
    while IFS= read -r -d '' line ; do
        printf '<test>%s</test>\n' "$line"
    done < <(grep -Pzo "$regex_multiline" "$logfile")
    echo end
}

test2(){
    #I need this to work for each match, instead of each line
    echo start
    while IFS= read -r line ; do
        printf '<test>%s</test>\n' "$line"
    done < <(grep -Pzo "$regex_multiline" "$logfile")
    echo end
}

test1 produces what I want:

start
<test>flag test1</test>
<test>flag test2</test>
<test>flag test3
        test4
        test5</test>
<test>flag test6

test7
 </test>
<test>flag test8</test>
end

test2 produces:

start
<test>flag test1</test>
<test>flag test2</test>
<test>flag test3</test>
<test>       test4</test>
<test>       test5</test>
<test>flag test6</test>
<test></test>
<test>test7</test>
<test> </test>
<test>flag test8</test>
end

2 Answers:

Answer 0: (score: 1)

I think you are better off using perl instead of grep. You can use your regex almost unmodified (see note 1) and simply replace each match with \1\x00 (see note 2):

regex_multiline="(?s)(flag[[:blank:]].*?)(?=(?:\r\n|[\r\n])flag|\z)"
perl -0777 -pe "s/$regex_multiline/\1\x00/g" < "$logfile"

1 Your regex was a bit odd, with capture groups that do nothing in the context of a grep command (like (flag)). I simply put the whole part you want to match into one group so that it corresponds to the \1 in the replacement part. Adjust it if needed or if I missed something.

2 Using \1\0 (for "match group 1", "null byte") actually works as well, but looks kind of confusing.

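Answer 1: (score: 0)

I found a solution. It is a bit of a hack, I guess, but it works consistently with grep version 2.20 and above. Do not use it with grep 2.25 and above, though. It is the combination of grep with the arguments -zon:

- z (treat the input as a set of lines, each terminated by a zero byte)
- o (print only the matched (non-empty) parts of a matching line)
- n (prefix each line of output with the 1-based line number within its input file)

This combination outputs "1:" at the start of every new match. Always. (I am not sure whether this is a bug in grep or by design, but it does make sense given the options -z and -o: with -z, a file without null bytes is one single "line", so every match is reported on line 1.)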

1:flag test1
1:flag test2
1:flag test3
    test4
    test5
1:flag test6

test7

1:flag test8

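Knowing this, we can use a find-and-replace that replaces the 1: at the start of each line with a null byte character. Note that a null byte is needed at the end of every match, so we have to add one manually after the last line!

This can be done with:

sed -e 's/^1:/\x0/g' | sed -e '$a\\x0'
or
awk '{gsub(/^1:/,"\x0");} 1' | sed -e '$a\\x0'

(I think sed is more efficient/faster for this kind of operation, but don't pin me down on that.)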

test2(){
    #This finally works!
    echo start
    while IFS= read -r -d '' line ; do
        printf '<test>%s</test>\n' "$line"
    done < <(grep -Pzon "$regex_multiline" "$logfile" | sed -e 's/^1:/\x0/g' | sed -e '$a\\x0' )
    echo end
}
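
The post-processing step can be tried without a particular grep version. The sketch below is a simplified variation, not the exact pipeline above: the hypothetical function `simulated_grep_output` uses printf to imitate the "1:"-prefixed output, the closing null byte is added with printf instead of a second sed call, and the loop additionally trims the leftover newline and skips the empty record in front of the first match. It assumes GNU sed, which understands \x0 in the replacement.

```shell
#!/bin/bash
# Simulated output of the -zon combination (an assumption for illustration;
# printf stands in for grep so the sketch does not depend on the grep version).
simulated_grep_output() {
    printf '1:flag a\n1:flag b\n    more\n1:flag c\n'
}

records=()
while IFS= read -r -d '' rec; do
    rec=${rec%$'\n'}            # drop the newline left in front of each NUL byte
    if [ -n "$rec" ]; then      # skip the empty record before the first match
        records+=("$rec")
    fi
done < <(simulated_grep_output | sed -e 's/^1:/\x0/'; printf '\0')

for rec in "${records[@]}"; do
    printf '<test>%s</test>\n' "$rec"
done
```

One thing to keep in mind with this approach: a line inside a match that happens to start with "1:" would also be turned into a record boundary.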