grep command does not select anything

Time: 2018-06-19 13:23:43

Tags: regex awk grep

I am trying to use the following grep command:

grep '(.*)(?=(png|html|jpg|js|css)(?:\s*))(png|html|jpg|js|css.*\s)' file

The file contains the following:

 http://manage.bostonglobe.com/GiftTheGlobe/LandingPage.html
 https://manage.bostonglobe.com/cs/mc/login.aspx?p1=BGFooter
 https://www.bostonglobe.com/bgcs
 /newsletters?p1=BGFooter_Newsletters
 https://bostonglobe.custhelp.com/app/home?p1=BGFooter
 https://bostonglobe.custhelp.com/app/answers/list?p1=BGFooter
 /tools/help/stafflist?p1=BGFooter
 https://www.bostonglobemedia.com/
 https://manage.bostonglobe.com/Order/newspaper/Newspaper.aspx
 https://www.facebook.com/globe
 https://twitter.com/#!/BostonGlobe
 https://plus.google.com/108227564341535363126/about
 https://epaper.bostonglobe.com/launch.aspx?pbid=2c60291d-c20c-4780-9829-     b3d9a12687cf
 http://nieonline.com/bostonglobe/
 https://secure.pqarchiver.com/boston-sub/no_default.html?ss=1&url=%2Fboston-sub%2Fadvancedsearch.html
 /tools/help/privacy?p1=BGFooter
 /tools/help/terms-service?p1=BGFooter
 /termsofpurchase?p1=BGFooter
 https://www.bostonglobemedia.com/careers
 /css/globe-print.css?v=19256I1935
 //meter.bostonglobe.com/css/style.css
 /css/globe-print.css?v=19256I1935
 //cdn.blueconic.net/bostonglobemedia.js
 /js/lib/rwd-images.js,lib/respond.min.js,lib/modernizr.custom.min.js,globe-          define.js,globe-controller.js?v=19256I1935
 data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///ywAAAAAAQABAAACAUwAOw==
 /js/lib/jquery.js,lib/lo-dash-custom-2.4.1.js,lib/a9.js,lib/pb.js,dist/ad-     init.js,globe-newsletter.js,globe-profile-page.js,dist/globe-topic-nav.js,dist/rakuten.js?v=19256I1935
 //dc8xl0ndzn2cb.cloudfront.net/js/bostonglobe/v0/keywee.min.js

For some reason it does not select anything from this file. I have tried different flags but cannot figure out what the problem is.

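1 answer:

Answer 0: (score: 0)

You are using a PCRE regex with the POSIX BRE engine (the default grep engine). In BRE, the characters (, ), ? and | are matched literally, and lookaheads such as (?=...) and non-capturing groups such as (?:...) are not supported, so the pattern never matches any of these lines.

For such patterns to work, use the -P option (available in GNU grep):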

grep -P 'YOUR_PCRE_PATTERN'
     ^^
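
As a rough illustration of the difference, the sketch below runs the pattern from the question against a small sample file, once with the default engine and once with -P. The temporary file and the variable names are assumptions made for this example, the three lines are copied from the question's file without their leading space, and it assumes a GNU grep built with PCRE support.

```shell
#!/bin/sh
# Hypothetical sample file: three lines taken from the question's input.
sample=$(mktemp)
cat > "$sample" <<'EOF'
http://manage.bostonglobe.com/GiftTheGlobe/LandingPage.html
https://www.bostonglobe.com/bgcs
//cdn.blueconic.net/bostonglobemedia.js
EOF

# The pattern from the question, unchanged.
pattern='(.*)(?=(png|html|jpg|js|css)(?:\s*))(png|html|jpg|js|css.*\s)'

# Default engine (BRE): the parentheses, ? and | are literal characters,
# so no line matches. grep -c still prints the count (0) in that case.
bre_count=$(grep -c "$pattern" "$sample")

# PCRE engine (-P): the lookahead and alternation are interpreted as regex syntax.
pcre_count=$(grep -cP "$pattern" "$sample")

echo "BRE matching lines:  $bre_count"
echo "PCRE matching lines: $pcre_count"

rm -f "$sample"
```

With -P, the .html and .js lines are selected, while the line without any of the listed extensions is not.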

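To develop and test PCRE patterns, the well-known regex101.com is commonly recommended.

Note that on macOS you can install GNU grep via brew (for example, brew install grep, which by default makes it available as ggrep).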