How to grep text in a file across line breaks/newlines

Time: 2019-03-20 14:28:52

Tags: grep newline break

I have to parse several files whose content looks like this:

style=3D""><a href=3D"https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ" style=3D"color:#3b599

I need to extract the https link, but my grep command cannot skip over the line breaks, so it returns a truncated result:

Command

grep -r -m1 -oh "https://123456789.com/accounts/confirm_email*\s*[^ ]*" /folder/

Result

https://123456789.com/accounts/confirm_email/19AbCDx=

Expected result

https://123456789.com/accounts/confirm_email/19AbCDx=K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1MjkwODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ

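PS: the '=' character is not (always) part of the link; it is how the file marks a wrapped line.

Note: https://123456789.com/accounts/confirm_email/ is the only constant part of the link that is repeated in all the files.

If I add the -z option, the -m1 option is ignored and the result is: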

https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"

If I append | head -3 the command seems to work, but the https prefix of the next match is repeated at the end of the last line:

Command

grep -r -oh -z "https://123456789.com/accounts/confirm_email*\s*[^ ]*" /folder/ | head -3

https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"https://123456789.com/accounts/confirm_email/19AbCDx=

How can I exclude it?

1 Answer:

Answer 0: (score: 1)


From man grep:

-z, --null-data
       Treat  the  input  as  a set of lines, each terminated by a zero
       byte (the ASCII NUL character) instead of a newline.

So:

$ grep -z -r -m1 -oh "https://123456789.com/accounts/confirm_email*\s*[^ ]*" file

Output:

https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"

The line breaks are still there, but you can strip them in a follow-up step.
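One possible way to strip the line breaks from the -z output is sketched below. This is only an illustration under several assumptions, not necessarily what the answer had in mind: it relies on GNU grep and GNU sed (both support -z), it treats every '=' directly before a newline as a wrap marker, and it stops the match at the closing double quote. The sample file and the extract_link helper are hypothetical names made up for this example.

```shell
#!/bin/sh
# Hypothetical sample file reproducing the three wrapped lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
style=3D""><a href=3D"https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&amp;ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ" style=3D"color:#3b599
EOF

# extract_link FILE (hypothetical helper): print the first confirm_email
# link found in FILE, with the "=<newline>" wrap markers removed.
extract_link() {
    # -z makes grep read NUL-terminated records, so a match may span newlines;
    # [^" ]* stops at the closing quote or at a space.
    grep -z -o 'https://123456789\.com/accounts/confirm_email/[^" ]*' "$1" |
        sed -z 's/=\n//g' |  # join the wrapped pieces (assumes GNU sed)
        tr '\0' '\n' |       # turn the NUL separators into one match per line
        head -n 1            # keep only the first match
}

url=$(extract_link "$sample")
printf '%s\n' "$url"
rm -f "$sample"
```

With the sample above this prints the link on a single line. Because every "=" followed by a newline is treated as a wrap marker, the "=" after 19AbCDx is dropped as well; if that character really belongs to the link, the sed expression would have to be adjusted. The head -n 1 at the end stands in for -m1, which stops after the first matching record, and with -z a file without NUL bytes is a single record.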