Question

I have a list containing several URLs, for example:

google.com
google.com/1
google.com/2
google.com/3
google.com/4
google.com/5
google.com/6
yahoo.com
yahoo.com/1
yahoo.com/2
yahoo.com/3
yahoo.com/4
yahoo.com/5
yahoo.com/6

How can I delete the first 3 entries so that only google.com/3 through google.com/6 remain, and likewise for yahoo.com?

Answer 1

In C#:

// Requires: using System.Text.RegularExpressions;
resultString = Regex.Replace(subjectString, 
    @"^        # Start at the start of a line
    [^/\r\n]+  # Match one or more characters except / and line breaks
    \r?$       # Match the end of the line (.NET's $ only matches before \n,
               # so an optional \r is consumed first), thereby ensuring that
               # the entire line does not contain a /
    (?:        # Match the following group:
     \r?\n     # - a linebreak (LF or CRLF)
     .*        # - an entire line
    ){2}       # exactly twice
    \r?\n      # Match the final line break", 
    "", RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

Resulting string:

google.com/3
google.com/4
google.com/5
google.com/6
yahoo.com/3
yahoo.com/4
yahoo.com/5
yahoo.com/6
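
For readers who want to try the same idea outside .NET, here is a minimal Python sketch of the pattern above. The names PATTERN and drop_leading_entries are made up for illustration, and the sketch assumes each host's entries appear together with the bare host line first, as in the question.

```python
import re

# Port of the verbose pattern to Python's re module (an illustrative sketch).
# \r?\n is used so that both LF and CRLF line endings are accepted.
PATTERN = re.compile(
    r"""^            # start of a line
    [^/\r\n]+        # one or more characters, none of them a slash
    \r?$             # end of that line, so the whole line has no slash
    (?:              # then this group:
      \r?\n          #   a line break
      .*             #   an entire line
    ){2}             # exactly twice
    \r?\n            # the final line break
    """,
    re.MULTILINE | re.VERBOSE,
)


def drop_leading_entries(text: str) -> str:
    """Remove each bare-host line together with the two lines after it."""
    return PATTERN.sub("", text)


if __name__ == "__main__":
    urls = []
    for host in ("google.com", "yahoo.com"):
        urls.append(host)
        urls.extend(f"{host}/{i}" for i in range(1, 7))
    print(drop_leading_entries("\n".join(urls)))
```

The pattern keys on the one line per host that has no slash, so it does not need to know the host names in advance.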

Answer 2

I'm not sure a regex is the best approach here. But anyway, here it is:

s/(google\.com[\s/\d]*){3}//
s/(yahoo\.com[\s/\d]*){3}//

The regex is enclosed in slashes; the leading s is the substitute command in vi notation.
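
As a rough illustration, the two substitutions can be tried in Python by applying them to the whole text at once, since the match has to span line breaks. The same sketch also shows a plain, non-regex alternative that groups adjacent entries by host. The function names strip_with_regex and strip_by_grouping are hypothetical, and both assume that each host's entries are adjacent in the list.

```python
import re
from itertools import groupby


def strip_with_regex(text: str, host: str) -> str:
    """Apply the substitution once: remove the first three entries of host."""
    # re.escape makes the dot in the host name literal.
    pattern = r"(?:" + re.escape(host) + r"[\s/\d]*){3}"
    return re.sub(pattern, "", text, count=1)


def strip_by_grouping(lines: list[str], n: int = 3) -> list[str]:
    """Non-regex alternative: drop the first n entries of each run of a host."""
    kept = []
    # The host is everything before the first slash; groupby only merges
    # consecutive lines, so entries of one host must be adjacent.
    for _, group in groupby(lines, key=lambda url: url.split("/", 1)[0]):
        kept.extend(list(group)[n:])
    return kept


if __name__ == "__main__":
    urls = []
    for host in ("google.com", "yahoo.com"):
        urls.append(host)
        urls.extend(f"{host}/{i}" for i in range(1, 7))

    text = "\n".join(urls)
    for host in ("google.com", "yahoo.com"):
        text = strip_with_regex(text, host)
    print(text)
    print(strip_by_grouping(urls))
```

The grouping version needs no per-host pattern, which may be easier to maintain if the list grows beyond two hosts.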

The regex deletes the repeated URLs.
