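I have a document.
I want to replace every arithmetic character in it (+, -, *, /) with its name (add, sub, mult, div), unless the character appears inside double quotes.
For example: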
a + b;
"a + b";
Expected output:
a add b;
"a + b";
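You can think of the document as a C program in which I want to convert each arithmetic operator to its name (add, sub, ...), but I do not want to touch an operator that sits inside double quotes.
How can I capture this with a Java regular expression?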
Answer 0 (score: 0)
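The following regular expression (try it on regex101)
[^\"].*(\+|\-|\*|\/).*[^\"]\;
matches:
[^\"] - any character that is not "
.* - followed by anything
(\+|\-|\*|\/) - a capturing group that captures +, -, * or /
.* - followed by anything
[^\"] - any character that is not "
\; - followed by a literal ;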
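To see what this pattern does with the two sample lines from the question, it can be tried with java.util.regex. The following is only a minimal sketch: the class name Answer0Demo and the helper capturedOperator are made up for illustration, and the pattern is written as a Java string literal, so every backslash is doubled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Answer0Demo {
    // The pattern from this answer as a Java string literal (backslashes doubled).
    static final Pattern LINE =
        Pattern.compile("[^\\\"].*(\\+|\\-|\\*|\\/).*[^\\\"]\\;");

    // Hypothetical helper: returns what capture group 1 holds,
    // or null if the pattern is not found in the line.
    static String capturedOperator(String line) {
        Matcher m = LINE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(capturedOperator("a + b;"));      // +
        System.out.println(capturedOperator("\"a + b\";"));  // null
        System.out.println(capturedOperator("a - b * c;"));  // *
    }
}
```

Because the first .* is greedy, group 1 holds the last operator on the line. The pattern therefore tells you whether a line qualifies, but on its own it does not locate every operator in that line.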
Because you are using it in Java, you must escape every backslash again.
Java-compatible regex:
"[^\\\"].*(\\+|\\-|\\*|\\/).*[^\\\"]\\;"
Answer 1 (score: 0)
You can use the String replaceAll method.
Example:
public class MyTest {
/**
* @param args
*/
public static void main(String[] args) {
String operation = "\"a + b / c\"";
String result = operation.replaceAll("\\+", "add").replaceAll("\\/", "div");
System.out.println(result);
}
}
This prints: "a add b div c"
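Answer 2 (score: 0)
This looks a bit like homework, and to me it reads more like a regex question than a Java question.
Here is an example of how to produce the output requested in the original question: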
$ cat TestRegex.java
public class TestRegex {
public static void main(final String[] args) {
String inputString = "c = a + b; cout << \" a + b \";";
System.out.println("inputString: '" + String.valueOf(inputString) + "'.");
System.out.println("replace (+, add) ex: '" + String.valueOf(inputString.replaceAll("(\\+)(?=(?:[^\"]|\"[^\"]*\")*$)", "add")) + "'.");
System.out.println("replacedAll: '" +
String.valueOf(
inputString
.replaceAll("(\\+)(?=(?:[^\"]|\"[^\"]*\")*$)", "add")
.replaceAll("(\\-)(?=(?:[^\"]|\"[^\"]*\")*$)", "sub")
.replaceAll("(\\*)(?=(?:[^\"]|\"[^\"]*\")*$)", "mult")
.replaceAll("(\\/)(?=(?:[^\"]|\"[^\"]*\")*$)", "div")
) + "'."
);
}
}
Example output:
$ java TestRegex
inputString: 'c = a + b; cout << " a + b ";'.
replace (+, add) ex: 'c = a add b; cout << " a + b ";'.
replacedAll: 'c = a add b; cout << " a + b ";'.
After modifying the input string further and testing again, I get:
$ java TestRegex
inputString: 'c = a + b + d / e * f - g; cout << " a + b ";'.
replace (+, add) ex: 'c = a add b add d / e * f - g; cout << " a + b ";'.
replacedAll: 'c = a add b add d div e mult f sub g; cout << " a + b ";'.
Hope this helps.
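The four chained replaceAll calls all use the same lookahead, so they could plausibly be folded into a single pass. The sketch below assumes that approach. The class name QuoteAwareReplace, the method replaceOperators and the NAMES map are invented for illustration and are not part of the answer. Like the original lookahead, it does not handle escaped quotes (\") inside a string literal.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteAwareReplace {
    // An operator followed, up to the end of the input, only by
    // unquoted characters or complete "..." pairs. This is the same
    // lookahead idea as in TestRegex above.
    private static final Pattern OP_OUTSIDE_QUOTES =
        Pattern.compile("[-+*/](?=(?:[^\"]|\"[^\"]*\")*$)");

    // Hypothetical lookup table: operator character -> name.
    private static final Map<String, String> NAMES = new HashMap<>();
    static {
        NAMES.put("+", "add");
        NAMES.put("-", "sub");
        NAMES.put("*", "mult");
        NAMES.put("/", "div");
    }

    static String replaceOperators(String input) {
        Matcher m = OP_OUTSIDE_QUOTES.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // The names contain no '$' or '\', so no quoting is needed here.
            m.appendReplacement(out, NAMES.get(m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceOperators(
            "c = a + b + d / e * f - g; cout << \" a + b \";"));
    }
}
```

With the second test input from above, this prints the same line as the replacedAll output: c = a add b add d div e mult f sub g; cout << " a + b ";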
Answer 3 (score: 0)
My brain hurts.
test1.c
#include <stdio.h>
#define MACRO1 a+b-c*d/e
#define MACRO2 "a+b-c*\
d/e\
a + b - c * d / e \
", a+b-c*d/e
const char* str = "\
as\"\\/\"-\\\\+df\
\\""asdf+""-*\\""/""\
";
/* comment + - * /
a / b * c - d + e */
// comment + - * / blah "+ - * /" \+\\\
char dq = '"'+0*'\"'/'\'';
char* cp = &dq;
const int* ip1;
int const* ip2;
int const * const * ip3;
void* vp;
long (*fp)(int(*));
char *a[23], *ZXCV,
**p2p
,***(xx),
* * c, * * * d[34];
struct s {};
typedef struct s* blah;
struct s s1,*S2;
enum E {E1};
enum E* e;
union U {};
union U* u;
int main(void) {
int x = 1+2-3*4/5;
x++ +1; ++x+2;
x-- -3; --x+4;
x = - --x;
x = + ++x;
x = -- x -3;
x = ++ x +4;
x += +1;
x -= -1;
x *= +1;
x /= -1;
ip1 = (int const *)str;
int y = * ip1;
blah *pblah; // can't recognize typedef
#define OPAQUE int a =
OPAQUE*ip1; // can't recognize macro
printf("test: %d %s\n", x, str );
} // end main() 1+2-3*4/5
COpSub.java
import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.util.Map;
import java.util.HashMap;
import java.nio.file.Files;
import java.nio.file.Paths;
public class COpSub {
public static void main(String[] args) throws Exception {
if (args.length != 2) { System.err.println("error: require two arguments."); System.exit(1); }
String fileName = args[0];
String encoding = args[1];
String source = readFile(fileName,encoding);
System.out.print(sub(source));
System.exit(0);
} // end main()
public static String sub(String s) throws Exception {
Map<String,String> m = new HashMap<String,String>();
// note: replacements must be escaped for appendReplacement()!
m.put("+","plus");
m.put("+=","plusequals");
m.put("-","minus");
m.put("-=","minusequals");
m.put("*","mul");
m.put("*=","mulequals");
m.put("/","div");
m.put("/=","divequals");
m.put("++","plusplus");
m.put("--","minusminus");
String typeAlternation = "void|char|signed\\s+char|unsigned\\s+char|short|short\\s+int|signed\\s+short|signed\\s+short\\s+int|unsigned\\s+short|unsigned\\s+short\\s+int|int|signed|unsigned|signed\\s+int|unsigned\\s+int|long|long\\s+int|signed\\s+long|signed\\s+long\\s+int|unsigned\\s+long|unsigned\\s+long\\s+int|long\\s+long|long\\s+long\\s+int|signed\\s+long\\s+long|signed\\s+long\\s+long\\s+int|unsigned\\s+long\\s+long|unsigned\\s+long\\s+long\\s+int|float|double|long\\s+double|(?:struct|enum|union)\\s+\\w+|const";
String safeCluster = ""
+"(?:" // overarching cluster
+ "\\s*" // skip over all leading whitespace to get to the interesting stuff
+ "(?:" // safe extent alternation
+ "(?:"+typeAlternation+")(?:\\s*\\*)+" // safe extent #1: pointer (to pointer to pointer...) of type
+ "|[+-]\\s+[+-]" // safe extent #2: non-lvalue requiring whitespace separation
+ "|[~!%^&*(=+\\[|;,?-](?:\\s*(?!\\+\\+|--)[+*-](?<!\\+\\+|--))++" // safe extent #3: non-lvalue not requiring whitespace separation -- can't include slash -- must be possessive to not give back part of pre/post increment/decrement -- fix broken vim syntax highlighting: )]
+ "|[^'\"+*/-]" // safe extent #4: guarantee safe punctuation char
+ ")" // end safe extent alternation
+")*+" // possessive gobble of safe extents
;
Pattern pattern = Pattern.compile(""
+"\\G" // start from previous match, or start-of-string for first search
+"(" // capture prefix
+ safeCluster // possessive gobble of safe extents
+ "(?:" // possessive zero-or-more unsafe extent clusters (possessive required for final match to not give up slash pattern)
+ "(?:" // unsafe extent cluster alternation (no suffix)
+ "'\\\\?.'" // unsafe extent #1: single-quoted char
+ "|\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"" // unsafe extent #2: double-quoted string; note this has its own internal safe gobble followed by zero-or-more unsafe extent cluster with suffix
+ "|/\\*[^*]*(?:\\*[^/][^*]*)*\\*/" // unsafe extent #3: traditional C comments; ditto
+ "|//[^\\n]*\\n" // unsafe extent #4: modern C comments; ditto
+ ")" // end unsafe extent cluster alternation (no suffix)
+ safeCluster // unsafe extent cluster safe suffix
+ ")*+" // end possessive zero-or-more unsafe extent clusters
+")" // end capture prefix
+"(\\+\\+|--|[+*/-]=?)" // capture operator
, Pattern.DOTALL );
StringBuffer b = new StringBuffer();
Matcher matcher = pattern.matcher(s);
boolean lastMatchWasOpAssign = false;
while (matcher.find()) {
if (lastMatchWasOpAssign)
matcher.appendReplacement(b, "$1$2" );
else
matcher.appendReplacement(b, "$1 "+m.get(matcher.group(2))+' ' );
lastMatchWasOpAssign = matcher.group(2).length() == 2 && matcher.group(2).charAt(1) == '=';
} // end while
matcher.appendTail(b);
return b.toString();
} // end sub()
public static String readFile(String fileName, String encoding ) throws Exception {
byte[] encoded = Files.readAllBytes(Paths.get(fileName));
return new String(encoded, encoding );
} // end readFile()
} // end class COpSub
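The double-quoted-string branch of the pattern above (unsafe extent #2) can be tried in isolation. The sketch below is a deliberately simplified illustration, not the answer's full pattern. It matches either a whole string literal, which is copied through unchanged, or a single operator, which is replaced. The class name SkipStringsSketch and the NAMES map are assumptions for this example. Comments, character literals, pointer declarations and compound operators such as ++ or += are not handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SkipStringsSketch {
    // Branch 1: a double-quoted literal with backslash escapes
    //           (same shape as unsafe extent #2 above).
    // Branch 2: a single operator character, captured in group 1.
    private static final Pattern TOKEN = Pattern.compile(
        "\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"|([-+*/])", Pattern.DOTALL);

    // Hypothetical lookup table using the names from COpSub.
    private static final Map<String, String> NAMES = new HashMap<>();
    static {
        NAMES.put("+", "plus");
        NAMES.put("-", "minus");
        NAMES.put("*", "mul");
        NAMES.put("/", "div");
    }

    static String sub(String s) {
        Matcher m = TOKEN.matcher(s);
        StringBuffer b = new StringBuffer();
        while (m.find()) {
            String op = m.group(1);
            // group(1) is null when the string-literal branch matched;
            // in that case the literal is written back untouched.
            String replacement = (op == null) ? m.group() : NAMES.get(op);
            // quoteReplacement protects backslashes inside the literal.
            m.appendReplacement(b, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(b);
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(sub("x = a + b; s = \"a \\\"+\\\" b\";"));
    }
}
```

Consuming the literal as a token of its own means an escaped quote inside the string cannot end it early, which is the case the simpler lookahead approach misses.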
Demo
> gcc test1.c -o test1;
> ./test1;
test: -4 as"\/"-\\+df\asdf+-*\/
> javac COpSub.java;
> CLASSPATH=. java COpSub test1.c UTF-8;
#include <stdio.h>
#define MACRO1 a plus b minus c mul d div e
#define MACRO2 "a+b-c*\
d/e\
a + b - c * d / e \
", a plus b minus c mul d div e
const char* str = "\
as\"\\/\"-\\\\+df\
\\""asdf+""-*\\""/""\
";
/* comment + - * /
a / b * c - d + e */
// comment + - * / blah "+ - * /" \+\\\
char dq = '"' plus 0 mul '\"' div '\'';
char* cp = &dq;
const int* ip1;
int const* ip2;
int const * const * ip3;
void* vp;
long (*fp)(int(*));
char *a[23], *ZXCV,
**p2p
,***(xx),
* * c, * * * d[34];
struct s {};
typedef struct s* blah;
struct s s1,*S2;
enum E {E1};
enum E* e;
union U {};
union U* u;
int main(void) {
int x = 1 plus 2 minus 3 mul 4 div 5;
x plusplus plus 1; plusplus x plus 2;
x minusminus minus 3; minusminus x plus 4;
x = - minusminus x;
x = + plusplus x;
x = minusminus x minus 3;
x = plusplus x plus 4;
x plusequals +1;
x minusequals -1;
x mulequals +1;
x divequals -1;
ip1 = (int const *)str;
int y = * ip1;
blah mul pblah; // can't recognize typedef
#define OPAQUE int a =
OPAQUE mul ip1; // can't recognize macro
printf("test: %d %s\n", x, str );
} // end main() 1+2-3*4/5
Diff