I am interested in Guava's string-splitting capabilities:
Splitter.on("|").split("foo|bar|baz");
// => "foo", "bar", "baz"
This works as expected.
But what if I want to split on "|", except where it appears between "[" and "]"?
Splitter.on(something).split("foo|ba[r|ba]z");
// => "foo", "ba[r|ba]z"
As far as I understand, it is not possible to define this "something" in Guava.
I found this: Issue 799: Add google escape library to Guava. Is it related?
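Answer 0 (score: 3)
The proper way to handle this is to write a parser. That is quite easy nowadays with a parser combinator library such as JParsec. You would end up with something like this: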
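class ParserFactory {
  // A bracketed run such as "[r|ba]": everything between "[" and "]".
  Parser escapedSequence() {
    return Parsers.between(Scanners.string("["),
        Scanners.anyCharacterButNot("]"), Scanners.string("]"));
  }
  // Either a bracketed run or any single character other than the separator.
  Parser chunk() {
    return Parsers.or(escapedSequence(), Scanners.anyCharacterButNot("|"));
  }
  // One or more chunks per piece, with pieces separated by "|".
  Parser wholeThing() {
    return Parsers.separatedBy(chunk().plus(), Scanners.string("|"));
  }
}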
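To make the grammar above concrete without depending on a parser library, here is a minimal hand-written sketch of the same idea in plain Java. The class name `ChunkParser` and its method names are hypothetical and only mirror the rule names used in the answer; this is not JParsec code. Unlike the `plus()` rule above, this sketch also accepts empty pieces, and it assumes brackets are not nested.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical recursive-descent sketch of the grammar:
//   wholeThing      := piece ('|' piece)*
//   piece           := chunk*
//   chunk           := escapedSequence | any character except '|'
//   escapedSequence := '[' (any character except ']')* ']'
final class ChunkParser {
    private final String input;
    private int pos = 0; // index of the next unread character

    ChunkParser(String input) {
        this.input = input;
    }

    List<String> wholeThing() {
        List<String> pieces = new ArrayList<>();
        pieces.add(piece());
        while (pos < input.length() && input.charAt(pos) == '|') {
            pos++; // consume the separator
            pieces.add(piece());
        }
        return pieces;
    }

    private String piece() {
        int start = pos;
        while (pos < input.length() && input.charAt(pos) != '|') {
            chunk();
        }
        return input.substring(start, pos);
    }

    private void chunk() {
        if (input.charAt(pos) == '[') {
            escapedSequence();
        } else {
            pos++; // an ordinary character
        }
    }

    private void escapedSequence() {
        int close = input.indexOf(']', pos + 1);
        if (close < 0) {
            throw new IllegalArgumentException("unclosed '[' at index " + pos);
        }
        pos = close + 1; // skip the whole bracketed run, separators included
    }

    public static void main(String[] args) {
        System.out.println(new ChunkParser("foo|ba[r|ba]z").wholeThing());
        // prints [foo, ba[r|ba]z]
    }
}
```

Each method corresponds to one grammar rule, which is the main benefit of the parser approach: changing the escaping rules means changing one small method.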
Answer 1 (score: 2)
Here is code that works for the given use case (using the existing Splitter code as a reference):
import static com.google.common.base.Preconditions.checkArgument;
import static com.google.common.base.Preconditions.checkNotNull;
import com.google.common.base.CharMatcher;
import com.google.common.collect.AbstractIterator;
import java.util.Iterator;
public class Splitter {
  private final CharMatcher trimmer;
  private final CharMatcher startTextQualifier;
  private final CharMatcher endTextQualifier;
  private final Strategy strategy;
  private Splitter(Strategy strategy, CharMatcher trimmer, CharMatcher startTextQualifier, CharMatcher endTextQualifier) {
    this.strategy = strategy;
    this.trimmer = trimmer;
    this.startTextQualifier = startTextQualifier;
    this.endTextQualifier = endTextQualifier;
  }
  private Splitter(Strategy strategy) {
    this(strategy, CharMatcher.NONE, CharMatcher.NONE, CharMatcher.NONE);
  }
  public Splitter trimResults(CharMatcher trimmer) {
    checkNotNull(trimmer);
    return new Splitter(strategy, trimmer, startTextQualifier, endTextQualifier);
  }
  // Separators between a start qualifier and the next end qualifier are ignored.
  public Splitter ignoreIn(CharMatcher startTextQualifier, CharMatcher endTextQualifier) {
    checkNotNull(startTextQualifier);
    checkNotNull(endTextQualifier);
    return new Splitter(strategy, trimmer, startTextQualifier, endTextQualifier);
  }
  public Splitter ignoreIn(char startTextQualifier, char endTextQualifier) {
    return ignoreIn(CharMatcher.is(startTextQualifier), CharMatcher.is(endTextQualifier));
  }
  public Splitter trimResults() {
    return trimResults(CharMatcher.WHITESPACE);
  }
  public static Splitter on(final CharMatcher separatorMatcher) {
    checkNotNull(separatorMatcher);
    return new Splitter(new Strategy() {
      @Override public SplittingIterator iterator(Splitter splitter, final CharSequence toSplit) {
        return new SplittingIterator(splitter, toSplit) {
          @Override int separatorStart(int start) {
            boolean wrapped = false;
            for (int i = start; i < toSplit.length(); i++) {
              // Note: with start qualifier '[' and end qualifier ']', this flag-based check
              // does not track nested qualifiers, i.e. it does not verify that every opened
              // qualifier is properly closed. For example,
              //   Splitter.on("|").ignoreIn('[', ']').split("abc|[abc|[def]ghi]|jkl")
              // yields: abc, [abc|[def]ghi], jkl
              if (!wrapped && startTextQualifier.matches(toSplit.charAt(i))) {
                wrapped = true;
              } else if (wrapped && endTextQualifier.matches(toSplit.charAt(i))) {
                wrapped = false;
              }
              if (!wrapped && separatorMatcher.matches(toSplit.charAt(i))) {
                return i;
              }
            }
            return -1;
          }
          @Override int separatorEnd(int separatorPosition) {
            return separatorPosition + 1;
          }
        };
      }
    });
  }
  public static Splitter on(final String separator) {
    checkArgument(!separator.isEmpty(), "The separator may not be the empty string.");
    checkArgument(separator.length() <= 2, "The separator's max length is 2, passed - %s.", separator);
    if (separator.length() == 1) {
      return on(separator.charAt(0));
    }
    return new Splitter(new Strategy() {
      @Override public SplittingIterator iterator(Splitter splitter, CharSequence toSplit) {
        return new SplittingIterator(splitter, toSplit) {
          @Override public int separatorStart(int start) {
            int delimiterLength = separator.length();
            boolean wrapped = false;
            positions:
            for (int p = start, last = toSplit.length() - delimiterLength; p <= last; p++) {
              // Track the qualifier state at the current position (not at index i),
              // using the same open/close logic as the CharMatcher-based variant.
              char c = toSplit.charAt(p);
              if (!wrapped && startTextQualifier.matches(c)) {
                wrapped = true;
              } else if (wrapped && endTextQualifier.matches(c)) {
                wrapped = false;
              }
              if (wrapped) {
                continue; // never report a separator inside a qualified region
              }
              for (int i = 0; i < delimiterLength; i++) {
                if (toSplit.charAt(i + p) != separator.charAt(i)) {
                  continue positions;
                }
              }
              return p;
            }
            return -1;
          }
          @Override public int separatorEnd(int separatorPosition) {
            return separatorPosition + separator.length();
          }
        };
      }
    });
  }
  public static Splitter on(char separator) {
    return on(CharMatcher.is(separator));
  }
  public Iterable<String> split(final CharSequence sequence) {
    checkNotNull(sequence);
    return new Iterable<String>() {
      @Override public Iterator<String> iterator() {
        return spliterator(sequence);
      }
    };
  }
  private Iterator<String> spliterator(CharSequence sequence) {
    return strategy.iterator(this, sequence);
  }
  private interface Strategy {
    Iterator<String> iterator(Splitter splitter, CharSequence toSplit);
  }
  private abstract static class SplittingIterator extends AbstractIterator<String> {
    final CharSequence toSplit;
    final CharMatcher trimmer;
    final CharMatcher startTextQualifier;
    final CharMatcher endTextQualifier;
    /**
     * Returns the first index in {@code toSplit} at or after {@code start}
     * that contains the separator.
     */
    abstract int separatorStart(int start);
    /**
     * Returns the first index in {@code toSplit} after {@code
     * separatorPosition} that does not contain a separator. This method is only
     * invoked after a call to {@code separatorStart}.
     */
    abstract int separatorEnd(int separatorPosition);
    int offset = 0;
    protected SplittingIterator(Splitter splitter, CharSequence toSplit) {
      this.trimmer = splitter.trimmer;
      this.startTextQualifier = splitter.startTextQualifier;
      this.endTextQualifier = splitter.endTextQualifier;
      this.toSplit = toSplit;
    }
    @Override
    protected String computeNext() {
      // Loop (rather than a single check) so that an empty piece is skipped
      // instead of ending the iteration early.
      while (offset != -1) {
        int start = offset;
        int separatorPosition = separatorStart(offset);
        int end = calculateEnd(separatorPosition);
        start = trimStartIfRequired(start, end);
        end = trimEndIfRequired(start, end);
        if (start != end) {
          return toSplit.subSequence(start, end).toString();
        }
      }
      return endOfData();
    }
    private int calculateEnd(int separatorPosition) {
      int end;
      if (separatorPosition == -1) {
        end = toSplit.length();
        offset = -1;
      } else {
        end = separatorPosition;
        offset = separatorEnd(separatorPosition);
      }
      return end;
    }
    private int trimEndIfRequired(int start, int end) {
      while (end > start && trimmer.matches(toSplit.charAt(end - 1))) {
        end--;
      }
      return end;
    }
    private int trimStartIfRequired(int start, int end) {
      while (start < end && trimmer.matches(toSplit.charAt(start))) {
        start++;
      }
      return start;
    }
  }
}
A small test:
public static void main(String[] args) {
  Splitter splitter = Splitter.on("|").ignoreIn('[', ']');
  System.out.println(Joiner.on(',').join(splitter.split("foo|ba[r|ba]z")));
  // yields -> foo,ba[r|ba]z
}
Please note: this code has not been thoroughly tested and does not cover all cases; feel free to modify it as needed.
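One case the code comment above calls out is nested qualifiers. A possible way to handle them, sketched here in plain Java rather than as a change to the Splitter class, is to replace the boolean flag with a depth counter. The class name `NestedAwareSplit` and its `split` signature are hypothetical. The sketch assumes the opening and closing characters differ and, unlike the iterator above, it keeps empty pieces and does no trimming.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split on a separator only at bracket depth zero.
final class NestedAwareSplit {
    static List<String> split(String input, char separator, char open, char close) {
        List<String> pieces = new ArrayList<>();
        int depth = 0; // number of currently open, not yet closed qualifiers
        int begin = 0; // start index of the piece being collected
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == open) {
                depth++;
            } else if (c == close && depth > 0) {
                depth--; // a stray closing character at depth zero is treated as plain text
            } else if (c == separator && depth == 0) {
                pieces.add(input.substring(begin, i));
                begin = i + 1;
            }
        }
        pieces.add(input.substring(begin)); // the remainder after the last separator
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(split("abc|[abc|[def]|ghi]|jkl", '|', '[', ']'));
        // prints [abc, [abc|[def]|ghi], jkl]
    }
}
```

With a single boolean flag, the first "]" in this input would close the region and the "|" that follows it would be treated as a separator; the counter keeps the region open until the matching outer "]".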
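Answer 2 (score: 0)
Guava's Splitter is very powerful: it can handle regex separators, split into maps, and more. But what you are trying to achieve is really beyond the scope of any generic parser.
You want a splitter with an on/off switch. I believe the only way to do that is manually, something like this: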
List<String> ls = new ArrayList<String>();
int b = 0;
int j = 0;
String str = "foo|ba[r|ba]z";
int e = str.indexOf('|');
do {
  if (b > j) {
    j = str.indexOf('[', j);
    while (j > 0 && e >= j) {
      j = str.indexOf(']', j);
      if (j < 0) {
        ls.add(str.substring(b));
        return;
      }
      j = str.indexOf('[', j);
    }
  }
  ls.add(str.substring(b, e));
  System.out.println(str.substring(b, e));
  b = ++e;
  e = str.indexOf('|', e);
} while (e >= 0);
(Disclaimer: this code is only meant to convey the idea; it does not work as written.)
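For readers who want something runnable, here is one possible way to turn the indexOf-based idea into working code, using only the Java standard library. The class name `BracketAwareSplit` is hypothetical. As assumptions: brackets are not nested, and an unclosed "[" makes the rest of the string a single piece, which is the behavior the snippet above aims for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the manual "on/off switch" split using indexOf.
final class BracketAwareSplit {
    static List<String> split(String str) {
        List<String> pieces = new ArrayList<>();
        int begin = 0; // start of the piece currently being collected
        int from = 0;  // position from which to resume searching
        while (true) {
            int bar = str.indexOf('|', from);
            if (bar < 0) {
                pieces.add(str.substring(begin)); // no more separators
                return pieces;
            }
            int open = str.indexOf('[', from);
            if (open >= 0 && open < bar) {
                // A bracket opens before the separator: jump past its closing bracket.
                int close = str.indexOf(']', open + 1);
                if (close < 0) {
                    pieces.add(str.substring(begin)); // unclosed bracket: rest is one piece
                    return pieces;
                }
                from = close + 1;
                continue;
            }
            pieces.add(str.substring(begin, bar));
            begin = bar + 1;
            from = begin;
        }
    }

    public static void main(String[] args) {
        System.out.println(split("foo|ba[r|ba]z"));
        // prints [foo, ba[r|ba]z]
    }
}
```

The switch is implicit here: whenever a "[" is found before the next "|", the search position skips ahead to just after the matching "]", so separators inside the brackets are never seen.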