Regex for circular replacement

Time: 2010-03-15 07:18:08

Tags: java regex string replace

How would you use regular expressions to write functions that do the following?

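  1. Replace lowercase 'a' with uppercase 'A', and vice versa.
    • If easily extendable, do this for all letters.
  2. Given words separated by whitespace, where > and < are special markers on some words, replace >word with word<, and vice versa.
    • If it helps, you can restrict the input so that every word must be marked one way or the other.
  3. Replace postincrement (i++;) with preincrement (++i;), and vice versa. Variable names are [a-z]+. The input can be assumed to consist only of a bunch of these statements. Bonus: also handle decrements.
  4. Solutions in other regex flavors are also of interest.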


    Note: this is NOT a homework question. See also my previous explorations of regex:

3 answers:

Answer 0 (score: 2)

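As you've no doubt gathered, the only sensible way to do this kind of thing is to perform all the replacements in a single pass, generating each replacement string dynamically from what was matched.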
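To see why a single pass matters, here is a minimal sketch for the 'a'/'A' case from the question. The class and method names (OnePassDemo, swapSequentially, swapInOnePass) are made up for illustration. Chaining two replaceAll calls lets the second call rewrite what the first one just produced, whereas a single scan decides each character exactly once.

```java
final class OnePassDemo {
    // Naive approach: two sequential replacements. The second call also
    // rewrites every 'A' that the first call just produced.
    static String swapSequentially(String s) {
        return s.replaceAll("a", "A").replaceAll("A", "a");
    }

    // Single pass: each character is examined exactly once, so a
    // replacement is never itself replaced.
    static String swapInOnePass(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            if (c == 'a') {
                out.append('A');
            } else if (c == 'A') {
                out.append('a');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapSequentially("banAna")); // banana
        System.out.println(swapInOnePass("banAna"));    // bAnanA
    }
}
```

The sequential version collapses everything to lowercase; the one-pass version actually swaps.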

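Java seems to be unique among today's major languages in not providing a convenient way to do that, but it can be done. You just have to use the lower-level API provided by the Matcher class. Here's a demonstration based on Elliott Hughes's definitive Rewriter class: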

import java.util.regex.*;

/**
 * A Rewriter does a global substitution in the strings passed to its
 * 'rewrite' method. It uses the pattern supplied to its constructor, and is
 * like 'String.replaceAll' except for the fact that its replacement strings
 * are generated by invoking a method you write, rather than from another
 * string. This class is supposed to be equivalent to Ruby's 'gsub' when
 * given a block. This is the nicest syntax I've managed to come up with in
 * Java so far. It's not too bad, and might actually be preferable if you
 * want to do the same rewriting to a number of strings in the same method
 * or class. See the example 'main' for a sample of how to use this class.
 *
 * @author Elliott Hughes
 */
public abstract class Rewriter
{
  private Pattern pattern;
  private Matcher matcher;

  /**
   * Constructs a rewriter using the given regular expression; the syntax is
   * the same as for 'Pattern.compile'.
   */
  public Rewriter(String regex)
  {
    this.pattern = Pattern.compile(regex);
  }

  /**
   * Returns the input subsequence captured by the given group during the
   * previous match operation.
   */
  public String group(int i)
  {
    return matcher.group(i);
  }

  /**
   * Overridden to compute a replacement for each match. Use the method
   * 'group' to access the captured groups.
   */
  public abstract String replacement();

  /**
   * Returns the result of rewriting 'original' by invoking the method
   * 'replacement' for each match of the regular expression supplied to the
   * constructor.
   */
  public String rewrite(CharSequence original)
  {
    this.matcher = pattern.matcher(original);
    StringBuffer result = new StringBuffer(original.length());
    while (matcher.find())
    {
      matcher.appendReplacement(result, "");
      result.append(replacement());
    }
    matcher.appendTail(result);
    return result.toString();
  }



  public static void main(String... args) throws Exception
  {
    String str = ">Foo baR<";

    // anonymous subclass example:
    Rewriter caseSwapper = new Rewriter("[A-Za-z]")
    {
      public String replacement()
      {
        char ch0 = group(0).charAt(0);
        char ch1 = Character.isUpperCase(ch0) ?
                   Character.toLowerCase(ch0) :
                   Character.toUpperCase(ch0);
        return String.valueOf(ch1);
      }
    };
    System.out.println(caseSwapper.rewrite(str));

    // inline subclass example:
    System.out.println(new Rewriter(">(\\w+)|(\\w+)<")
    {
      public String replacement()
      {
        return group(1) != null ? group(1) + "<"
                                : ">" + group(2);
      }
    }.rewrite(str));

  }
}
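For readers on newer JDKs: Java 9 added an overload of Matcher.replaceAll that accepts a Function<MatchResult, String>, so the same two rewrites can be sketched without a helper class. The class name FunctionalRewrite and its method names are hypothetical. Note that the string returned by the function is still interpreted as a replacement template, so the sketch wraps it in Matcher.quoteReplacement where it is not obviously free of '$' and '\'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FunctionalRewrite {
    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");
    private static final Pattern MARKED = Pattern.compile(">(\\w+)|(\\w+)<");

    // Task 1: swap the case of every ASCII letter.
    static String swapCase(String s) {
        return LETTER.matcher(s).replaceAll(m -> {
            char c = m.group().charAt(0);
            return String.valueOf(Character.isUpperCase(c)
                    ? Character.toLowerCase(c)
                    : Character.toUpperCase(c));
        });
    }

    // Task 2: turn ">word" into "word<" and "word<" into ">word".
    static String swapMarkers(String s) {
        return MARKED.matcher(s).replaceAll(m ->
                Matcher.quoteReplacement(m.group(1) != null
                        ? m.group(1) + "<"
                        : ">" + m.group(2)));
    }

    public static void main(String[] args) {
        System.out.println(swapCase(">Foo baR<"));    // >fOO BAr<
        System.out.println(swapMarkers(">Foo baR<")); // Foo< >baR
    }
}
```

This requires Java 9 or later; on older versions the Rewriter approach above remains the way to go.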

Answer 1 (score: 1)

The best way to do this is to use a regular expression for matching and a callback for the replacement. For example, in Python:

import re

# First example: swap the case of every letter
s = 'abcDFE'
print(re.sub(r'\w', lambda x: x.group().lower()
                              if x.group().isupper()
                              else x.group().upper(), s))
# OUTPUT: ABCdfe

# Second example: move the marker to the other side of each word
s = '<abc dfe> <ghe <auo pio>'
def switch(match):
  text = match.group()
  if text[0] == '<':
    # '<word' becomes 'word>'
    return text[1:] + '>'
  else:
    # 'word>' becomes '<word'
    return '<' + text[:-1]
print(re.sub(r'<\w+|\w+>', switch, s))
# OUTPUT: abc> <dfe ghe> auo> <pio

Answer 2 (score: 1)

Perl, also using code in the replacement:

$\ = $/;

### 1.
$_ = 'fooBAR';

s/\w/lc $& eq $&? uc $&: lc $&/eg;

# this isn't a regex but better (in most cases):
# tr/A-Za-z/a-zA-Z/;

print;
# FOObar

### 2.
$_ = 'foo >bar baz<';

s/>(\w+)|(\w+)</$1?"$1<":">$2"/eg;

print;
# foo bar< >baz

### 3.
$_ = 'x; ++i; i--;';

s/(--|\+\+)?\b([a-z]\w*)\b(?(1)|(--|\+\+))/$1?"$2$1":"$3$2"/eig;

print;
# x; i++; --i;
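The Perl pattern for task 3 relies on a conditional group, (?(1)...), which java.util.regex does not support. A rough Java sketch of the same swap can use a plain alternation instead: one branch for the prefix form and one for the postfix form. The class name IncrementSwapper is made up for illustration, and variable names are assumed to be [a-z]+ as stated in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IncrementSwapper {
    // Alternative 1: prefix operator, then name (groups 1 and 2).
    // Alternative 2: name, then postfix operator (groups 3 and 4).
    private static final Pattern INC_DEC =
            Pattern.compile("(\\+\\+|--)([a-z]+)|([a-z]+)(\\+\\+|--)");

    static String swap(String code) {
        Matcher m = INC_DEC.matcher(code);
        StringBuffer out = new StringBuffer(code.length());
        while (m.find()) {
            String swapped = m.group(1) != null
                    ? m.group(2) + m.group(1)   // ++i becomes i++
                    : m.group(4) + m.group(3);  // i-- becomes --i
            m.appendReplacement(out, Matcher.quoteReplacement(swapped));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(swap("x; ++i; i--;")); // x; i++; --i;
    }
}
```

A bare variable such as x is left alone because neither branch can match without an operator.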