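I want to split a string on commas that are not inside parentheses or square brackets
I am working with the following string
Potatoes, Vegetable Oil (Sunflower, Corn, And/or Canola Oil), Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast, Onion Powder, Spices], Maltodextrin Fructose, Yeast Extract, Molasses, Natural Flavor [Including Milk], Corn Starch, Honey, Gum Arabic, Paprika Extracts, Caramel Color, Garlic Powder, Citric Acid, And Sunflower Oil
How I want it split (+ marks where I want the splits to happen)
Potatoes+Vegetable Oil (Sunflower, Corn, And/or Canola Oil)+Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast, Onion Powder, Spices]+Maltodextrin Fructose+Yeast Extract+Molasses+Natural Flavor [Including Milk]+Corn Starch+Honey+Gum Arabic+Paprika Extracts+Caramel Color+Garlic Powder+Citric Acid+And Sunflower Oil
The closest I have gotten to something that works is this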
,(?![^\[\(]*[$\]\)])
Answer 0: (score: 3)
Maybe you want something like this:
(?!<(?:\(|\[)[^)\]]+),(?![^(\[]+(?:\)|\]))
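When fed this input in Java (note the extra ] and ( inserted at arbitrary positions to make it well-formed):
Potatoes, Vegetable Oil (Sunflower, Corn, And/or Canola Oil), Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast], Onion Powder, Spices, Maltodextrin Fructose, Yeast Extract, Molasses, Natural Flavor [Including Milk], Corn Starch, Honey, Gum Arabic, Paprika Extracts, Caramel Color (Garlic Powder, Citric Acid, And Sunflower Oil).
It produces the output: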
Potatoes
Vegetable Oil (Sunflower, Corn, And/or Canola Oil)
Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast]
Onion Powder
Spices
Maltodextrin Fructose
Yeast Extract
Molasses
Natural Flavor [Including Milk]
Corn Starch
Honey
Gum Arabic
Paprika Extracts
Caramel Color (Garlic Powder, Citric Acid, And Sunflower Oil).
This is exactly "splitting on top-level commas".
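For reference, here is a minimal sketch of how the pattern above might be fed to `String.split` in Java. The pattern string is copied verbatim from this answer; the class name `RegexSplitDemo`, the helper `split`, and the trimming of each piece are assumptions added for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;

final class RegexSplitDemo {
    // Pattern copied verbatim from the answer, with backslashes doubled for Java.
    static final String TOP_LEVEL_COMMA =
            "(?!<(?:\\(|\\[)[^)\\]]+),(?![^(\\[]+(?:\\)|\\]))";

    // Hypothetical helper: split on the pattern, then trim each piece,
    // because the pattern consumes only the comma and not the space after it.
    static List<String> split(String input) {
        List<String> items = new ArrayList<>();
        for (String piece : input.split(TOP_LEVEL_COMMA)) {
            items.add(piece.trim());
        }
        return items;
    }

    public static void main(String[] args) {
        String input = "Potatoes, Vegetable Oil (Sunflower, Corn, And/or Canola Oil), "
                + "Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast], Onion Powder";
        for (String item : split(input)) {
            System.out.println(item);
        }
    }
}
```

The lookahead only checks whether a closing bracket appears before the next opening one, so this approach assumes the brackets are balanced and not nested.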
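Note, however, that this regex is very inefficient: every comma triggers a lookahead that rescans the text up to the next opening bracket. Counting brackets with a regex is not a good idea. It looks like this can be solved with a simple left-to-right scan followed by a simple split.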
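The scan-based alternative could look roughly like the sketch below. It is not code from the answer: the class name `TopLevelCommaSplitter` and its `split` method are hypothetical, the scan and the split are folded into a single pass, and it assumes that `(` and `[` are the only grouping characters and that they are reasonably balanced.

```java
import java.util.ArrayList;
import java.util.List;

final class TopLevelCommaSplitter {
    // Splits on commas that are outside every (...) and [...] group.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        int depth = 0;  // number of brackets currently open
        int start = 0;  // index where the current item begins
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(' || c == '[') {
                depth++;
            } else if ((c == ')' || c == ']') && depth > 0) {
                depth--;  // stray closers at depth 0 are ignored
            } else if (c == ',' && depth == 0) {
                parts.add(input.substring(start, i).trim());
                start = i + 1;
            }
        }
        parts.add(input.substring(start).trim());  // last item
        return parts;
    }

    public static void main(String[] args) {
        String input = "Potatoes, Vegetable Oil (Sunflower, Corn, And/or Canola Oil), "
                + "Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast], Onion Powder";
        for (String item : split(input)) {
            System.out.println(item);
        }
    }
}
```

Each character is visited once, so the running time is linear in the input length, and unlike the lookahead regex the depth counter also copes with nested groups.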
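Answer 1: (score: 2)
Sometimes you are better off searching for what you want (i.e. whitelisting) rather than trying to find the split points between the things you want (i.e. blacklisting):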
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitDemo {
    public static void main(String[] args) {
        String haystack = "Potatoes, Vegetable Oil (Sunflower, Corn, And/or Canola Oil), "
                + "Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast], Onion Powder, "
                + "Spices, Maltodextrin Fructose, Yeast Extract, Molasses, "
                + "Natural Flavor [Including Milk], Corn Starch, Honey, Gum Arabic, "
                + "Paprika Extracts, Caramel Color (Garlic Powder, Citric Acid, And Sunflower Oil).";
        // Match each item directly instead of looking for the separators between items.
        Matcher m = Pattern.compile("\\w[^\\[(,]*(\\[[^]]*\\]|\\([^)]*\\))?")
                .matcher(haystack);
        while (m.find()) {
            System.out.println("'" + m.group() + "'");
        }
    }
}
Output:
'Potatoes'
'Vegetable Oil (Sunflower, Corn, And/or Canola Oil)'
'Honey BBQ Seasoning [Sugar, Salt, Dextrose, Torula Yeast]'
'Onion Powder'
'Spices'
'Maltodextrin Fructose'
'Yeast Extract'
'Molasses'
'Natural Flavor [Including Milk]'
'Corn Starch'
'Honey'
'Gum Arabic'
'Paprika Extracts'
'Caramel Color (Garlic Powder, Citric Acid, And Sunflower Oil)'
Note that the resulting strings do not contain any leading or trailing whitespace.
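Regex explanation:
"\w[^\[(,]*(\[[^]]*\]|\([^)]*\))?" - after backslash-escape processing
"\w                              " - find a word character (letter, digit, or underscore)
"  [^\[(,]*                      " - ...followed by anything except [ ( or ,
"          (         |         )?" - ...optionally followed by:
"           \[     \]            " - ......something in square brackets
"             [^]]*              " - .........anything except ]
"                     \(     \)  " - ......or something in parentheses
"                       [^)]*    " - .........anything except )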
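The breakdown above can also be kept next to the pattern itself. The following is a sketch only, assuming Java's `Pattern.COMMENTS` flag (which ignores whitespace in the pattern and treats `#` to end of line as a comment). The class name `CommentedPatternDemo` and the method `items` are hypothetical, the capturing group is written as a non-capturing one, and the `]` inside the character class is escaped for readability; otherwise the pattern is meant to match the same text as the one in this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommentedPatternDemo {
    // Same idea as the answer's pattern, annotated inline via COMMENTS mode.
    static final Pattern ITEM = Pattern.compile(
              "\\w              # a word character starts the item\n"
            + "[^\\[(,]*        # then anything up to an opening bracket, parenthesis, or comma\n"
            + "(?:              # optionally followed by one of:\n"
            + "  \\[[^\\]]*\\]  #   a square-bracket group\n"
            + "  |              #   or\n"
            + "  \\([^)]*\\)    #   a parenthesized group\n"
            + ")?",
            Pattern.COMMENTS);

    // Hypothetical helper: collect every match as one item.
    static List<String> items(String haystack) {
        List<String> found = new ArrayList<>();
        Matcher m = ITEM.matcher(haystack);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        for (String item : items("Potatoes, Vegetable Oil (Sunflower, Corn), Honey [Sugar, Salt]")) {
            System.out.println("'" + item + "'");
        }
    }
}
```

Like the original, this matches at most one bracketed group per item and does not handle nested brackets.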