I have a series of lines like the following (they can appear in any order):
Distal latency 4.9 N/A N/A 4.0 N/A N/A N/A N/A 6.3 4.4 N/A
% failed Chicago Classification 70 1 1 0 1 1 1 1 0 0 1
% panesophageal pressurization 0 0 0 0 0 0 0 0 0 0 0
% premature contraction 20 0 0 1 0 0 0 0 0 1 0
% rapid contraction 10 0 0 1 0 0 0 0 0 0 0
% large breaks 10 0 0 0 0 0 0 0 1 0 0
% small breaks 10 0 0 1 0 0 0 0 0 0 0
I want to end up with the row title and each value extracted into a Hash, like this:
Distallatency=4.9,Distallatency=N/A etc.
failedChicagoClassification1=70,failedChicagoClassification1=1,failedChicagoClassification1=1,failedChicagoClassification1=0,failedChicagoClassification1=1 etc.
and so on
My strategy for doing this is:
1. Join the words of the row title together by replacing the \s between them
2. End the joined title with a delimiter character, e.g. ":", so that each line can then be split into an array
3. Loop through the array, adding the row title together with each value to a Hash
This is what I have so far:
Pattern match_patternSwallow2 = Pattern.compile("(?:.*\\d+\\.\\d|N\\/A|\\d*){4,50}");
Matcher matchermatch_patternSwallow2 = match_patternSwallow2.matcher(s);
while (matchermatch_patternSwallow2.find()) {
    String found = matchermatch_patternSwallow2.group(0).trim();
    System.out.println(found);
    // Join up the words so can then split by space
    found = found.replaceAll("([A-Za-z]+)\\s", "$1_").replaceAll("\\s", ":");
    List<String> myList = new ArrayList<String>(Arrays.asList(found.split(":")));
    for (int ff = 1; ff < myList.size(); ff++) {
        mapSwallow.put(myList.get(0) + "MapSwallowsNum" + ff, myList.get(ff));
    }
}
The capture runs without errors, but the System.out line only prints empty strings.
What am I doing wrong?
Answer 0 (score: 1)
I can suggest the following regex to match each line that meets your criteria:
"(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$"
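See the regex demo.
Details:
- (?m) - multiline mode, so that ^ and $ match at the start and end of each line
- ^ - start of a line
- \\W* - 0+ non-word characters
- ([a-zA-Z].*?) - (Group 1) a letter followed by any 0+ characters other than line breaks, as few as possible
- \\s* - zero or more whitespace characters
- ((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*) - (Group 2) 0+ sequences of digits (optionally followed by a dot and more digits) or N/A, each followed by 0+ whitespace characters
- $ - end of the line
Once a match is found, use .group(1).replaceAll("\\s+","") as the key, then split .group(2) with .split("\\s+") to get the values.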
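To see what the two groups hold for a single row, here is a minimal sketch. The class name RowRegexDemo and the helper methods key and values are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class RowRegexDemo {
    // The same pattern as above, written as a Java string literal.
    static final Pattern ROW = Pattern.compile(
        "(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");

    // Group 1 with all whitespace removed, or null when the line does not match.
    static String key(String line) {
        Matcher m = ROW.matcher(line);
        return m.find() ? m.group(1).replaceAll("\\s+", "") : null;
    }

    // Group 2 split on whitespace, or an empty list when the row has no values.
    static List<String> values(String line) {
        Matcher m = ROW.matcher(line);
        if (!m.find() || m.group(2).trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(m.group(2).trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String line = "% failed Chicago Classification 70 1 1 0";
        System.out.println(key(line));    // failedChicagoClassification
        System.out.println(values(line)); // [70, 1, 1, 0]
    }
}
```

The leading "% " is consumed by \\W*, so it never ends up in the key. The explicit empty check is there because splitting an empty string in Java yields one empty element rather than none.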
See the sample code:
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String s = "Distal latency 4.9 N/A N/A 4.0 N/A N/A N/A N/A 6.3 4.4 N/A\n\n % failed Chicago Classification 70 1 1 0 1 1 1 1 0 0 1\n\n % panesophageal pressurization 0 0 0 0 0 0 0 0 0 0 0\n\n % premature contraction 20 0 0 1 0 0 0 0 0 1 0\n\n % rapid contraction 10 0 0 1 0 0 0 0 0 0 0\n\n % large breaks 10 0 0 0 0 0 0 0 1 0 0\n\n % small breaks 10 0 0 1 0 0 0 0 0 0 0";
        // Group 1 is the row title, group 2 the run of values that ends the line
        Pattern match_patternSwallow2 = Pattern.compile("(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");
        Matcher matchermatch_patternSwallow2 = match_patternSwallow2.matcher(s);
        HashMap<String, String> mapSwallow = new HashMap<String, String>();
        while (matchermatch_patternSwallow2.find()) {
            String[] myList = matchermatch_patternSwallow2.group(2).split("\\s+");
            // Key prefix: the row title with all whitespace removed
            String p1 = matchermatch_patternSwallow2.group(1).replaceAll("\\s+", "");
            int line = 1; // 1-based position of the value within its row
            for (String p2s : myList) {
                mapSwallow.put(p1 + line, p2s);
                line++;
            }
        }
        System.out.println(mapSwallow);
    }
}
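The code above stores every value under its own numbered key (p1 + line) in a HashMap, whose iteration order is unspecified, so the printed entries will not necessarily follow the order of the input. If one list of values per row is easier to work with, a variant along the following lines may help. This is only a sketch under my own assumptions: the class name SwallowTableParser and the method parse are hypothetical, and a LinkedHashMap is swapped in to keep rows in input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant, not part of the original answer.
class SwallowTableParser {
    private static final Pattern ROW = Pattern.compile(
        "(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");

    // Maps each row title (whitespace removed) to its values, in input order.
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        Matcher m = ROW.matcher(text);
        while (m.find()) {
            String key = m.group(1).replaceAll("\\s+", "");
            String values = m.group(2).trim();
            if (values.isEmpty()) {
                continue; // a title with no values after it: skip the row
            }
            rows.put(key, new ArrayList<>(Arrays.asList(values.split("\\s+"))));
        }
        return rows;
    }

    public static void main(String[] args) {
        String s = "Distal latency 4.9 N/A 4.0\n % large breaks 10 0 1";
        // prints {Distallatency=[4.9, N/A, 4.0], largebreaks=[10, 0, 1]}
        System.out.println(parse(s));
    }
}
```

With this shape, the third value of a row is simply parse(s).get("largebreaks").get(2), and no index has to be encoded into the key.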