I have a file in the following format:
011110659
A101 000001 $.45
031100762
1030 000001 $.45
071000288
1040 000003 $1.35
1040 000001 $.45
103100195
1030 000001 $33.45
J5BU 000001 $.45
I would like to use a regular expression to make it look like this:
011110659 A101 000001 $.45
031100762 1030 000001 $.45
071000288 1040 000003 $1.35
071000288 1040 000001 $.45
103100195 1030 000001 $33.45
103100195 J5BU 000001 $.45
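What I want to do is copy the text of each line that contains only a single string and prepend it to every following line that already contains several strings.
I could do this with a script, but is there a way to do it with a regular expression?
Answer 0 (score: 0)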
The closest I got is to replace
((?<=\n|^)\d+)\s+\r?\n
with
$1
Demo here: http://regex101.com/r/qD5rJ6
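To see what this replacement does in practice, here is a minimal Java sketch. The class name ClosestRegexDemo, the method joinIdWithNextLine, and the sample input are assumptions made for illustration. The input assumes each ID is followed by a trailing space before the line break (the \s+ in the pattern needs at least one whitespace character there), and the replacement is written as "$1 " with a trailing space, also an assumption, so that the columns stay separated.

```java
import java.util.regex.Pattern;

class ClosestRegexDemo {
    // The pattern from this answer: a run of digits at the start of a line,
    // followed by whitespace and a line break.
    private static final Pattern ID_LINE =
        Pattern.compile("((?<=\\n|^)\\d+)\\s+\\r?\\n");

    // Joins each ID line with the line that directly follows it.
    // Assumption: the replacement keeps one space after the ID.
    static String joinIdWithNextLine(String input) {
        return ID_LINE.matcher(input).replaceAll("$1 ");
    }

    public static void main(String[] args) {
        // Hypothetical sample: IDs carry a trailing space, data lines do not.
        String input =
            "011110659 \n" +
            "A101 000001 $.45\n" +
            "071000288 \n" +
            "1040 000003 $1.35\n" +
            "1040 000001 $.45\n";
        System.out.print(joinIdWithNextLine(input));
    }
}
```

With this sample, each ID is joined to the first line after it, while the second line under 071000288 is left without an ID. That is presumably why this is only the closest result: a single replacement of this kind does not carry the ID forward to later lines.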
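Answer 1 (score: 0)
Perhaps not as concise as you were hoping for, but this works. The regular expression is important here, but it is not the central part of the solution.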
Top and setup:
import java.util.Arrays;
import java.util.ArrayList;
import java.util.regex.Pattern;
/**
   <P>{@code java FormatDataLinesWithRegexXmpl}</P>
 **/
public class FormatDataLinesWithRegexXmpl {
   public static final void main(String[] igno_red) {
      //Fall back to a real newline if the property is missing
      String sLS = System.getProperty("line.separator", "\n");
      //Sample input: column-1 lines, blank lines, and column-2/3/4 lines
      StringBuilder sdInput = new StringBuilder().
         append("011110659 ").append(sLS).
         append(" ").append(sLS).
         append(" A101 000001 $.45").append(sLS).
         append(" ").append(sLS).
         append("031100762 ").append(sLS).
         append(" ").append(sLS).
         append(" 1030 000001 $.45").append(sLS).
         append(" ").append(sLS).
         append("071000288 ").append(sLS).
         append(" ").append(sLS).
         append(" 1040 000003 $1.35").append(sLS).
         append(" ").append(sLS).
         append(" 1040 000001 $.45").append(sLS).
         append(" ").append(sLS).
         append("103100195 ").append(sLS).
         append(" ").append(sLS).
         append(" 1030 000001 $33.45").append(sLS).
         append(" ").append(sLS).
         append(" J5BU 000001 $.45").append(sLS);
Main logic:
      //config
      int iCOL1LEN = 9;
      String sRE234 = "" +
         "(\\w{4})[\\t ]+" +       //Column 2
         "(\\d{6})[\\t ]+" +       //Column 3
         "\\$(\\d*\\.?\\d+)$";     //Column 4 (at most one decimal point)
      Pattern p234 = Pattern.compile(sRE234);
      ArrayList<String> alLines = new ArrayList<String>(Arrays.asList(sdInput.toString().split(sLS)));
      ArrayList<DataLine> aldl = new ArrayList<DataLine>();
      System.out.println("read...START");
      DataLine dlCurr = null;
      int iLn = -1;
      while(alLines.size() > 0) {
         iLn++;  //1st iteration: was -1, now 0
         String sLine = alLines.remove(0).trim();
         if(sLine.length() == 0) {
            //Blank line: skip
            continue;
         } else if(sLine.length() == iCOL1LEN) {
            //A line of exactly 9 characters is treated as column 1
            if(dlCurr != null) {
               throw new IllegalStateException("[line " + iLn + "]: Found two column-1s in a row.");
            }
            dlCurr = new DataLine(sLine);
            System.out.print("1");
         } else if(p234.matcher(sLine).matches()) {
            if(dlCurr == null) {
               //Current 234-columns have no corresponding column-1.
               //Use previous.
               dlCurr = new DataLine(aldl.get(aldl.size() - 1).sCol1);
               System.out.print("1");
            }
            dlCurr.set234(sLine);
            aldl.add(dlCurr);
            System.out.println("234");
            dlCurr = null;
         }
      }
      System.out.println("read...END");
      System.out.println("Output:");
      for(DataLine dl : aldl) {
         System.out.println(dl.sCol1 + " " + dl.sCol2 + " " + dl.sCol3 + " " + dl.sCol4);
      }
   }
}
A simple data holder class:
class DataLine {
   public String sCol1 = null;
   public String sCol2 = null;
   public String sCol3 = null;
   public String sCol4 = null;
   public DataLine(String s_col1) {
      sCol1 = s_col1;
   }
   //Splits an already-trimmed "col2 col3 col4" line on tabs/spaces
   public void set234(String s_234) {
      String[] as = s_234.split("[\t ]+");
      sCol2 = as[0];
      sCol3 = as[1];
      sCol4 = as[2];
   }
}
Output:
[C:\java_code\]java FormatDataLinesWithRegexXmpl
read...START
1234
1234
1234
1234
1234
1234
read...END
Output:
011110659 A101 000001 $.45
031100762 1030 000001 $.45
071000288 1040 000003 $1.35
071000288 1040 000001 $.45
103100195 1030 000001 $33.45
103100195 J5BU 000001 $.45
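For comparison, the same idea can be sketched more compactly: remember the most recent column-1 line and prefix it to every data line that follows. This is only a sketch under assumptions taken from the sample data (an ID is exactly nine digits; a data line is a four-character code, a six-digit count, and a dollar amount). The class name PrefixIdLines and the method prefixIds are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PrefixIdLines {
    // Assumption: an ID line consists of exactly nine digits.
    private static final Pattern ID = Pattern.compile("\\d{9}");
    // Assumption: a data line is "code count $amount", separated by tabs/spaces.
    private static final Pattern DATA =
        Pattern.compile("\\w{4}[\\t ]+\\d{6}[\\t ]+\\$\\d*\\.?\\d+");

    // Returns the data lines, each prefixed with the most recent ID seen.
    static List<String> prefixIds(List<String> lines) {
        List<String> out = new ArrayList<String>();
        String currentId = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (ID.matcher(line).matches()) {
                currentId = line;               // remember the ID for later lines
            } else if (DATA.matcher(line).matches()) {
                if (currentId == null) {
                    throw new IllegalStateException("Data line before any ID: " + line);
                }
                out.add(currentId + " " + line);
            }
            // Blank or unrecognized lines are skipped.
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "071000288 ", " ", " 1040 000003 $1.35", " ", " 1040 000001 $.45",
            "103100195 ", " ", " 1030 000001 $33.45", " ", " J5BU 000001 $.45");
        for (String s : prefixIds(input)) {
            System.out.println(s);
        }
    }
}
```

Unlike the longer program, this version keeps each data line as a single trimmed string rather than splitting it into columns, which is enough when the only goal is to prepend the ID.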