Delete all characters after a matched string

Time: 2014-11-30 09:28:33

Tags: awk sed grep

I have a file with multiple lines:

http://example.com/part-1   this    number 1 one 
http://example.com/part--2  this is number 21 two
http://example.com/part10   this is an number 12 ten
http://example.com/part-num-11  this is an axample  number 212 eleven

How can I delete everything between the first column and "number x", plus everything after "number x"? I want output like this:

http://example.com/part-1    1
http://example.com/part--2   21 
http://example.com/part10    12
http://example.com/part-num-11   212 

Another case. Input:

http://server1.example.com/00/part-1    this    number 1 one 
http://server2.example.com/1a/part--2   this is section 21 two two
http://server3.example.com/2014/5/part10    this is an Part 12 ten  ten ten
http://server5.example.com/2014/7/part-num-11   this is an PARt number 212 eleven

I want the same output here. The number I want is always the last numeric field on the line.

4 answers:

Answer 0 (score: 1)

Here is one way:

awk -F"number" '{split($1,a," ");split($2,b," ");print a[1],b[1]}' file
http://example.com/part-1 1
http://example.com/part--2 21
http://example.com/part10 12
http://example.com/part-num-11 212

If the number you want is always in the second-to-last field, this should work too:

awk '{print $1,$(NF-1)}' file
http://example.com/part-1 1
http://example.com/part--2 21
http://example.com/part10 12
http://example.com/part-num-11 212
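Both commands above depend on the layout of the first sample: the first splits on the literal word "number", and the second assumes the number sits in the second-to-last field. In the second input from the question, some lines use "section" or "Part" instead, and the trailing words vary in count. A minimal sketch that follows the asker's stated rule (take the last numeric field) could scan the fields from right to left. The function name `last_numeric_field` is made up for illustration, and the sketch assumes the wanted number is a whitespace-separated field consisting only of digits.

```shell
# Hypothetical helper: print the first field and the last all-digit field.
last_numeric_field() {
    awk '{
        for (i = NF; i > 1; i--) {      # walk backwards, never testing the URL in $1
            if ($i ~ /^[0-9]+$/) {      # first all-digit field seen from the right
                print $1, $i
                next                    # done with this line
            }
        }
    }' "$@"
}

# Sample lines copied from the second case in the question.
last_numeric_field <<'EOF'
http://server1.example.com/00/part-1    this    number 1 one 
http://server2.example.com/1a/part--2   this is section 21 two two
http://server3.example.com/2014/5/part10    this is an Part 12 ten  ten ten
http://server5.example.com/2014/7/part-num-11   this is an PARt number 212 eleven
EOF
```

With these four lines it pairs each URL with 1, 21, 12 and 212. Lines that have no all-digit field after the first column are skipped silently, which may or may not be what you want.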

Answer 1 (score: 0)

sed -r 's/^([^0-9]*[0-9]+)[^0-9]*([0-9]+).*/\1 \2/' file

Output:

http://example.com/part-1 1
http://example.com/part--2 21
http://example.com/part10 12
http://example.com/part-num-11 212
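The expression above treats the first run of digits on the line as the end of the URL, so it works for the first sample but probably not for the second one, where the host names and paths contain digits (server1, 2014, and so on). A possible variant, offered only as a sketch, anchors on whitespace instead: keep the first whitespace-delimited column, then the last run of digits that stands alone as a field. The function name `first_and_last_number` is hypothetical, and `-E` is assumed to be available (GNU and BSD sed both accept it; `-r` is the older GNU spelling).

```shell
# Hypothetical helper: keep column 1 and the last standalone number.
first_and_last_number() {
    # \1 = first column (no whitespace inside)
    # .* = greedy, so the match settles on the LAST whitespace+digits pair
    # \2 = digits that are followed by whitespace or by end of line
    sed -E 's/^([^[:space:]]+).*[[:space:]]([0-9]+)([[:space:]].*)?$/\1 \2/' "$@"
}

first_and_last_number <<'EOF'
http://server1.example.com/00/part-1    this    number 1 one 
http://server3.example.com/2014/5/part10    this is an Part 12 ten  ten ten
EOF
```

Unlike an awk filter, sed prints lines that do not match the pattern unchanged, so lines without a standalone number pass through as they are.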

Answer 2 (score: 0)

Try this:

sed 's/ .*number \([0-9][0-9]*\).*/ \1/' myfile.txt

Answer 3 (score: 0)

Thanks, everyone. Based on your comments, I came up with my own solution:

sed -re 's/([0-9]+)/#\1#/g' file | sed -re 's/(^.*#).*/\1/g' | sed 's/#//g' | awk '{print $1"  "$NF}'

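My idea: wrap every group of digits in "#", as in #[numbers]#, then keep everything from the start of the line up to the last "#" (the match is greedy, so sed picks the last "#") and drop the rest. After that, remove the "#" markers and let awk print the first and last fields.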
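The same idea (cut the line right after its last digit, then print the first and last fields) can probably be expressed in a single awk call without the temporary "#" markers. This is only a sketch of an equivalent approach, not the pipeline above; the function name `trim_after_last_digit` is made up for illustration.

```shell
# Hypothetical helper: cut each line after its last digit, print first and last field.
trim_after_last_digit() {
    awk '{
        sub(/[^0-9]*$/, "")         # remove the trailing run of non-digits
        if (NF) print $1, $NF       # $0 was re-split, so $NF is the last number
    }' "$@"
}

trim_after_last_digit <<'EOF'
http://example.com/part-1   this    number 1 one 
http://server2.example.com/1a/part--2   this is section 21 two two
EOF
```

It shares the pipeline's limits: if nothing after the URL contains a digit, the URL itself becomes the last field, and a word with digits glued to letters (such as "part10") would be picked up whole rather than as a number.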

Thanks, everyone (y)