How to search for and delete a pattern in a delimited text file

Time: 2011-11-19 05:15:09

Tags: macos sed awk grep delimiter

I have the following text:

s:50:"index.php?attachment=$matches[1]&cpage=$matches[2]";s:44:"(term-conditions-for-employers)/trackback/?$";s:35:"index.php?pagename=$matches[1]&tb=1";s:71:"(term-conditions-for-employers)/feed/(feed|rdf|rss|rss2|atom|jobman)/?$";s:47:"index.php?pagename=$matches[1]&feed=$matches[2]";s:66:"(term-conditions-for-employers)/(feed|rdf|rss|rss2|atom|jobman)/?$";s:47:"index.php?pagename=$matches[1]&feed=$matches[2]";s:52:"(term-conditions-for-employers)/page/?([0-9]{1,})/?$";s:48:"index.php?pagename=$matches[1]&paged=$matches[2]";s:59:"(term-conditions-for-employers)/comment-page-([0-9]{1,})/?$";s:48:"index.php?pagename=$matches[1]&cpage=$matches[2]";s:44:"(term-conditions-for-employers)(/[0-9]+)?/?$";s:47:"index.php?pagename=$matches[1]&page=$matches[2]";s:26:"home/attachment/([^/]+)/?$";s:32:"index.php?attachment=$matches[1]";s:36:"home/attachment/([^/]+)/trackback/?$";s:37:"index.php?attachment=$matches[1]&tb=1";s:63:"home/attachment/([^/]+)/feed/(feed|rdf|rss|rss2|atom|jobman)/?$";s:49:"index.php?attachment=$matches[1]&feed=$matches[2]";s:58:"home/attachment/([^/]+)/(feed|rdf|rss|rss2|atom|jobman)/?$";

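What I want to do is search for the word jobman and delete every entry in which that word is found. Each entry ends with the delimiter ";" (a semicolon). I need to do this from the Mac OS command line, so I have tools such as grep, fgrep, and awk available.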

1 answer:

Answer 0: (score: 1)

First, what exactly do we need to remove from this text?

$> grep -o "[^;]*jobman[^;]*;" ./text
s:71:"(term-conditions-for-employers)/feed/(feed|rdf|rss|rss2|atom|jobman)/?$";
s:66:"(term-conditions-for-employers)/(feed|rdf|rss|rss2|atom|jobman)/?$";
s:63:"home/attachment/([^/]+)/feed/(feed|rdf|rss|rss2|atom|jobman)/?$";
s:58:"home/attachment/([^/]+)/(feed|rdf|rss|rss2|atom|jobman)/?$";

If that is the right set of entries, then

$> sed "s/[^;]*jobman[^;]*;//g" ./text 
s:50:"index.php?attachment=$matches[1]&cpage=$matches[2]";s:44:"(term-conditions-for-employers)/trackback/?$";s:35:"index.php?pagename=$matches[1]&tb=1";s:47:"index.php?pagename=$matches[1]&feed=$matches[2]";s:47:"index.php?pagename=$matches[1]&feed=$matches[2]";s:52:"(term-conditions-for-employers)/page/?([0-9]{1,})/?$";s:48:"index.php?pagename=$matches[1]&paged=$matches[2]";s:59:"(term-conditions-for-employers)/comment-page-([0-9]{1,})/?$";s:48:"index.php?pagename=$matches[1]&cpage=$matches[2]";s:44:"(term-conditions-for-employers)(/[0-9]+)?/?$";s:47:"index.php?pagename=$matches[1]&page=$matches[2]";s:26:"home/attachment/([^/]+)/?$";s:32:"index.php?attachment=$matches[1]";s:36:"home/attachment/([^/]+)/trackback/?$";s:37:"index.php?attachment=$matches[1]&tb=1";s:49:"index.php?attachment=$matches[1]&feed=$matches[2]";

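What "s/[^;]*jobman[^;]*;//g" actually does is search for the group [^;]*jobman[^;]*; that is, any run of characters other than ";", then jobman, then any run of characters other than ";", then the ";" that closes the entry. Because [^;]* can never cross a semicolon, each match is exactly one entry. We then replace the match with '' (the empty string), and the g flag repeats the replacement for every match on each line of the text.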
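To make this explanation easier to check, here is a minimal sketch that runs the same sed expression on a short made-up sample. The `sample` string, the variable names and the temporary file are assumptions for illustration and are not part of the original text. The second half shows one possible way to write the result back to a file, on the assumption that you want the file itself changed rather than the result printed.

```shell
# Hypothetical sample with the same entry layout as the text in the question.
sample='s:3:"abc";s:10:"x|jobman|y";s:3:"def";s:6:"jobman";'

# Same substitution as in the answer, applied to the sample on standard input.
result=$(printf '%s\n' "$sample" | sed 's/[^;]*jobman[^;]*;//g')
printf '%s\n' "$result"    # prints: s:3:"abc";s:3:"def";

# To change a file in place instead of printing the result, -i with an
# attached backup suffix is accepted by both BSD sed (macOS) and GNU sed.
tmpfile=$(mktemp "${TMPDIR:-/tmp}/entries.XXXXXX")   # throwaway demo file
printf '%s\n' "$sample" > "$tmpfile"
sed -i.bak 's/[^;]*jobman[^;]*;//g' "$tmpfile"       # original saved as $tmpfile.bak
inplace=$(cat "$tmpfile")
printf '%s\n' "$inplace"   # same line as the first result

# Clean up the demo file and its backup.
rm -f "$tmpfile" "$tmpfile.bak"
```

The expression is single-quoted here so the shell passes it to sed untouched; the double-quoted form in the answer behaves the same because the pattern contains nothing the shell would expand.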