I have a string like this:
OG=ACC-0000000009| AMBORFFA KIRI|P.O.BOX 1FAF6GPO,GPO,FG/FFERER OB=XXXX-XXCC|ABC|14332 X HWay|Vica |MNSJD IS=BIC-dfsgdf|asas nduf|142 ERRET ERT RET|ERTERT Island|ERTERT BF=ACC-0000013417711DD028|534 DFG ION|ONE DALLAERRS CENTER RR N.| ERTERT, SUITE 1300, DRRALLAS,|Pb, 75201/ PBB DT=GREAT CHART|0000FGHFGGL028434
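There is no delimiter between the OG, OB, IS, etc. values. I want to split the string roughly on '=' so that the OG, OB, ... field names are kept in the resulting pieces. I then need to process each of these fields further into subfields.
Answer 0 (score: 1)
Something like this? (Scala code)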
val str =
"OG=ACC-0000000009| AMBORFFA KIRI|P.O.BOX 1FAF6GPO,GPO,FG/FFERER OB=XXXX-XXCC|ABC|14332 X HWay|Vica |MNSJD IS=BIC-dfsgdf|asas nduf|142 ERRET ERT RET|ERTERT Island|ERTERT BF=ACC-0000013417711DD028|534 DFG ION|ONE DALLAERRS CENTER RR N.| ERTERT, SUITE 1300, DRRALLAS,|Pb, 75201/ PBB DT=GREAT CHART|0000FGHFGGL028434 "
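// Split before every two non-space characters followed by '=', then map each
// two-character key to its '|'-separated subfields.
str.split("(?=\\S\\S=)")
  .foldLeft(Map.empty[String, Array[String]]) {
    case (m, s) => m + (s.take(2) -> s.drop(3).split("\\|"))
  }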
//res0: Map[String,Array[String]] =
// HashMap(OG -> Array(ACC-0000000009, " AMBORFFA KIRI", "P.O.BOX 1FAF6GPO,GPO,FG/FFERER ")
// , OB -> Array(XXXX-XXCC, ABC, 14332 X HWay, "Vica ", "MNSJD ")
// , DT -> Array(GREAT CHART, "0000FGHFGGL028434 ")
// , IS -> Array(BIC-dfsgdf, asas nduf, 142 ERRET ERT RET, ERTERT Island, "ERTERT ")
// , BF -> Array(ACC-0000013417711DD028, 534 DFG ION, ONE DALLAERRS CENTER RR N., " ERTERT, SUITE 1300, DRRALLAS,", "Pb, 75201/ PBB "))
Update: a new requirement was added in the comments.
val str =
"OG=ACC-0000000009| AMBORFFA KIRI|P.O.BOX 1FAF6GPO,GPO,FG/FFERER Transaction Amount= 1223|546SD|376KL OB=XXXX-XXCC|ABC|14332 X HWay|Vica |MNSJD IS=BIC-dfsgdf|asas nduf|142 ERRET ERT RET|ERTERT Island|ERTERT BF=ACC-0000013417711DD028|534 DFG ION|ONE DALLAERRS CENTER RR N.| ERTERT, SUITE 1300, DRRALLAS,|Pb, 75201/ PBB DT=GREAT CHART|0000FGHFGGL028434 "
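// Keys may now be longer than two characters, so cut each chunk at its first '='.
str.split(raw"\b(?=Transaction Amount=|\S\S=)")
  .foldLeft(Map.empty[String, Array[String]]) {
    case (m, s) =>
      val (k, v) = s.splitAt(s.indexOf("="))  // v still starts with '='
      m + (k -> v.tail.split("\\|"))
  }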
//HashMap(OG -> Array(ACC-0000000009, " AMBORFFA KIRI", "P.O.BOX 1FAF6GPO,GPO,FG/FFERER ")
// , OB -> Array(XXXX-XXCC, ABC, 14332 X HWay, "Vica ", "MNSJD ")
// , Transaction Amount -> Array(" 1223", 546SD, "376KL ")
// , DT -> Array(GREAT CHART, "0000FGHFGGL028434 ")
// , IS -> Array(BIC-dfsgdf, asas nduf, 142 ERRET ERT RET, ERTERT Island, "ERTERT ")
// , BF -> Array(ACC-0000013417711DD028, 534 DFG ION, ONE DALLAERRS CENTER RR N., " ERTERT, SUITE 1300, DRRALLAS,", "Pb, 75201/ PBB "))
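The `\b` in the updated pattern does real work: without it, `\S\S=` also matches just before the `nt=` inside `Transaction Amount=`, which cuts that key in two. The following is a minimal sketch of the difference, written in Java because `String.split` uses the same `java.util.regex` engine that Scala's `split` delegates to. The class name `BoundaryDemo`, the helper `keysOf`, and the shortened input are illustrative assumptions, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;

public class BoundaryDemo {
    // Hypothetical helper: split the input with the given regex and return
    // the text before the first '=' of every chunk that contains one.
    static List<String> keysOf(String input, String regex) {
        List<String> keys = new ArrayList<>();
        for (String chunk : input.split(regex)) {
            int eq = chunk.indexOf('=');
            if (eq >= 0) {
                keys.add(chunk.substring(0, eq));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the question's string.
        String s = "OG=A|B Transaction Amount= 1223|546SD OB=C|D";

        // Without \b the long key is cut before "nt=".
        System.out.println(keysOf(s, "(?=Transaction Amount=|\\S\\S=)"));
        // prints [OG, nt, OB]

        // With \b a split is only allowed at a word boundary.
        System.out.println(keysOf(s, "\\b(?=Transaction Amount=|\\S\\S=)"));
        // prints [OG, Transaction Amount, OB]
    }
}
```

The word boundary works here because every key starts right after a space (or at the start of the string), while the position before `nt=` sits between two letters.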
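Answer 1 (score: 1)
A regular expression may be one solution, but I would suggest using a delimiter whenever possible.
Here is my solution, though I am not sure it works in every case: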
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String text = "OG=ACC-0000000009| AMBORFFA KIRI|P.O.BOX 1FAF6GPO,GPO,FG/FFERER OB=XXXX-XXCC|ABC|14332 X HWay|Vica |MNSJD IS=BIC-dfsgdf|asas nduf|142 ERRET ERT RET|ERTERT Island|ERTERT BF=ACC-0000013417711DD028|534 DFG ION|ONE DALLAERRS CENTER RR N.| ERTERT, SUITE 1300, DRRALLAS,|Pb, 75201/ PBB DT=GREAT CHART|0000FGHFGGL028434";
        // Regex for a field name: one or more uppercase letters directly before '='
        String regexField = "(?<field>[A-Z]+)=";
        Pattern pattern = Pattern.compile(regexField);
        Matcher matcher = pattern.matcher(text);
        // Extract field names
        List<String> fields = new ArrayList<>();
        while (matcher.find()) {
            fields.add(matcher.group("field"));
        }
        // Extract values by splitting on the same field regex
        List<String> values = Arrays.stream(text.split(regexField))
                .map(String::trim)
                .filter(e -> !e.isEmpty())
                .collect(Collectors.toList());
        // Pair fields with values; LinkedHashMap keeps them in input order
        Map<String, String> data = new LinkedHashMap<>();
        if (fields.size() == values.size()) {
            for (int i = 0; i < fields.size(); i++) {
                data.put(fields.get(i), values.get(i));
            }
        } else {
            System.out.println("Sizes are different. Something is not good.");
        }
        data.forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
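To illustrate the suggestion to use a delimiter whenever possible: if the producer of the string could be changed to emit a separator between fields, no regex would be needed. The sketch below assumes a hypothetical format in which fields are separated by `;` and `;` never occurs inside a value. The class name `DelimitedFields` and the sample input are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DelimitedFields {
    // Assumption: fields are separated by ';' and ';' never appears in a value.
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> data = new LinkedHashMap<>();
        for (String field : text.split(";")) {
            String[] kv = field.split("=", 2);   // cut at the first '=' only
            if (kv.length < 2) {
                continue;                        // skip chunks without a key
            }
            data.put(kv[0].trim(), Arrays.asList(kv[1].split("\\|")));
        }
        return data;
    }

    public static void main(String[] args) {
        // Hypothetical delimited variant of the question's data.
        String text = "OG=ACC-0000000009|AMBORFFA KIRI;OB=XXXX-XXCC|ABC;DT=GREAT CHART|0000FGHFGGL028434";
        parse(text).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

With an explicit separator, key length and key characters no longer matter, which is what makes this more robust than guessing field boundaries with a pattern.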
Answer 3 (score: 0)
Here is a Scala regex solution:
val data = "OG=ACC-0000000009| AMBORFFA KIRI|P.O.BOX 1FAF6GPO,GPO,FG/FFERER..."
data.split("""(?=\w\w=)""")
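This uses a lookahead pattern to split the data at every position that is immediately followed by two word characters and an = sign. Because the lookahead consumes no characters, the key stays at the start of each piece.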
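The lookahead split only yields raw pieces such as `OG=...`; each piece still has to be separated into a key and its `|`-delimited subfields. The following is a minimal Java sketch of that second step (Java's `String.split` accepts the same pattern). It assumes every key is exactly two word characters long and that no value contains two word characters followed by `=`. The class name `LookaheadSplit` and the shortened sample are illustrative.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LookaheadSplit {
    // Split at every position followed by two word characters and '=',
    // then separate each chunk into a key and its '|'-delimited subfields.
    static Map<String, List<String>> parse(String data) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        for (String chunk : data.split("(?=\\w\\w=)")) {
            int eq = chunk.indexOf('=');
            if (eq < 0) {
                continue; // text before the first key, if any
            }
            String key = chunk.substring(0, eq);
            // trim() removes the space that separated this field from the next
            String[] subfields = chunk.substring(eq + 1).trim().split("\\|");
            fields.put(key, Arrays.asList(subfields));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the question's string.
        String data = "OG=ACC-0000000009| AMBORFFA KIRI OB=XXXX-XXCC|ABC DT=GREAT CHART|0000FGHFGGL028434";
        parse(data).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

A key longer than two characters would be cut by this pattern, so longer keys need an explicit alternative in the lookahead.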