Question

I am trying to split a string into two strings using a regular expression.

For example:

String original1 = "Calpol Plus 100MG";

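The string above should be split into

String string1 = "Calpol Plus"; and String string2 = "100MG";

I tried calling .split(" ") on the string, but that only works when the original string is "Calpol 100MG".

Since I am new to regular expressions, I searched for some patterns and ended up with "[^0-9MG]", but it still cannot handle strings like "Syrup 10ML".

I would like a general regular expression that works for both kinds of strings.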

Answer 1

Simply split the input on the one or more whitespace characters that come right before a <number>MG or <number>ML string.

string.split("\\s+(?=\\d+M[LG])");  // Use the regex "\\s+(?=\\d+(?:\\.\\d+)?M[LG])" instead if floating-point numbers are possible.

Example:

String original1 = "Calpol Plus 100MG";
String[] strs = original1.split("\\s+(?=\\d+M[LG])");
for (int i = 0; i < strs.length; i++) {
    System.out.println(strs[i]);
}

Assigning the results to variables:

String original1 = "Calpol Plus 100MG";
String[] strs = original1.split("\\s+(?=\\d+M[LG])");
String string1 = strs[0];
String string2 = strs[1];
System.out.println(string1);
System.out.println(string2);

Output:

Calpol Plus
100MG

Code 2:

String original1 = "Syrup 10ML";
String[] strs = original1.split("\\s+(?=\\d+M[LG])");
String string1 = strs[0];
String string2 = strs[1];
System.out.println(string1);
System.out.println(string2);

Output:

Syrup
10ML
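The floating-point variant mentioned in the comment above can be sketched as follows. This is only an illustration: the class name DecimalDoseSplit, the helper splitDose, and the input "Syrup 2.5ML" are assumptions made for the example and do not come from the question.

```java
import java.util.Arrays;

class DecimalDoseSplit {
    // Whitespace that is followed by digits, an optional decimal part, then MG or ML.
    static final String DOSE_BOUNDARY = "\\s+(?=\\d+(?:\\.\\d+)?M[LG])";

    // Hypothetical helper: splits a product name from its dose.
    static String[] splitDose(String input) {
        return input.split(DOSE_BOUNDARY);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitDose("Calpol Plus 100MG"))); // [Calpol Plus, 100MG]
        System.out.println(Arrays.toString(splitDose("Syrup 2.5ML")));       // [Syrup, 2.5ML]
    }
}
```

With the integer-only regex, "Syrup 2.5ML" would not be split at all, because the digit after the space is followed by "." rather than by M.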

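Explanation

\s+ matches one or more whitespace characters.

(?=\d+M[LG]) is a positive lookahead asserting that the match must be followed by one or more digits (\d+) and then MG or ML. Because a lookahead consumes no characters, the number and unit stay in the second piece. (In a Java string literal each backslash is doubled, hence "\\s+(?=\\d+M[LG])".)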
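To see why the lookahead matters, here is a small sketch that contrasts it with a delimiter that consumes the dose. The class and method names are made up for this illustration.

```java
import java.util.Arrays;

class LookaheadContrast {
    // Without a lookahead, the dose is part of the delimiter and is thrown away.
    static String[] splitConsuming(String input) {
        return input.split("\\s+\\d+M[LG]");
    }

    // With the lookahead, only the whitespace is the delimiter; the dose is kept.
    static String[] splitWithLookahead(String input) {
        return input.split("\\s+(?=\\d+M[LG])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitConsuming("Calpol Plus 100MG")));     // [Calpol Plus]
        System.out.println(Arrays.toString(splitWithLookahead("Calpol Plus 100MG"))); // [Calpol Plus, 100MG]
    }
}
```

In the first call the delimiter " 100MG" sits at the end of the input, and String.split drops the trailing empty string, so only one element remains.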


Answer 2

Try something like:

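import java.util.regex.Matcher;
import java.util.regex.Pattern;

String original1 = "Calpol Plus 100MG";
// First alternative: a run of letters and spaces (the name); second: digits followed by the rest (the dose).
// [0-9]+ rather than [0-9]* keeps find() from returning an empty match at the end of the input.
Pattern p = Pattern.compile("[A-Za-z ]+|[0-9]+.*");
Matcher m = p.matcher(original1);
while (m.find()) {
    System.out.println(m.group().trim()); // trim() drops the space left after "Plus"
}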

Answer 3

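I propose two solutions:

You can create a pattern that matches the whole string and use groups to extract the desired information.
You can use a lookahead assertion to make sure you split in front of a digit.

Which solution works best for you depends on the variety of inputs you have. If you use groups, you will always find the last amount part. If you use split, you can more easily extract more complicated amount groups such as "2 tea-spoons" (with the first solution you would need to extend the [A-Za-z] class to include -, e.g. by using [-A-Za-z] instead) or "2.5L" (with the first solution you would need to extend the [0-9] class to include ., e.g. by using [0-9.] instead).

Source: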

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created for http://stackoverflow.com/q/27329519/1266906
 */
public class RecipeSplitter {

    /**
     * {@code ^} the Pattern has to be applied from the start of the String on
     * {@code (.*)} match any characters into Group 1
     * {@code \\s+} followed by at least one whitespace
     * {@code ([0-9]+\\s*[A-Za-z]+)} followed by Group 2 which is made up of at least one digit, optional whitespace and
     *                               at least one letter
     * {@code $} the Pattern has to be applied so that at the End of the Pattern the End of the String is reached
     */
    public static final Pattern INGREDIENT_PATTERN                   = Pattern.compile("^(.*)\\s+([0-9]+\\s*[A-Za-z]+)$");
    /**
     * {@code \\s+} at least one whitespace
     * {@code (?=[0-9])} next is a digit ({@code ?=} ensures it is there but doesn't include it in the match, so we
     *                   don't remove it)
     */
    public static final Pattern WHITESPACE_FOLLOWED_BY_DIGIT_PATTERN = Pattern.compile("\\s+(?=[0-9])");

    public static void matchWholeString(String input) {
        Matcher matcher = INGREDIENT_PATTERN.matcher(input);
        if (matcher.find()) {
            System.out.println(
                    "\"" + input + "\" was split into \"" + matcher.group(1) + "\" and \"" + matcher.group(2) + "\"");
        } else {
            System.out.println("\"" + input + "\" was not of the expected format");
        }
    }

    public static void splitBeforeNumber(String input) {
        String[] strings = WHITESPACE_FOLLOWED_BY_DIGIT_PATTERN.split(input);
        System.out.println("\"" + input + "\" was split into " + Arrays.toString(strings));
    }

    public static void main(String[] args) {
        matchWholeString("Calpol Plus 100MG");
        // "Calpol Plus 100MG" was split into "Calpol Plus" and "100MG"
        matchWholeString("Syrup 10ML");
        // "Syrup 10ML" was split into "Syrup" and "10ML"
        splitBeforeNumber("Calpol Plus 100MG");
        // "Calpol Plus 100MG" was split into [Calpol Plus, 100MG]
        splitBeforeNumber("Syrup 10ML");
        // "Syrup 10ML" was split into [Syrup, 10ML]
    }
}
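As a rough sketch of the character-class extensions described above for the group-based solution, the pattern below widens [0-9] to [0-9.] and [A-Za-z] to [-A-Za-z]. The class name ExtendedIngredientMatch, the helper matchAmount, and the sample inputs are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtendedIngredientMatch {
    // Same shape as INGREDIENT_PATTERN, but the amount may contain '.' and the unit may contain '-'.
    static final Pattern EXTENDED_PATTERN =
            Pattern.compile("^(.*)\\s+([0-9.]+\\s*[-A-Za-z]+)$");

    // Hypothetical helper: returns {name, amount}, or an empty array if the input does not match.
    static String[] matchAmount(String input) {
        Matcher matcher = EXTENDED_PATTERN.matcher(input);
        if (matcher.find()) {
            return new String[] {matcher.group(1), matcher.group(2)};
        }
        return new String[0];
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(matchAmount("Sugar 2 tea-spoons"))); // [Sugar, 2 tea-spoons]
        System.out.println(Arrays.toString(matchAmount("Milk 2.5L")));          // [Milk, 2.5L]
    }
}
```

Note that [0-9.]+ is deliberately loose: it would also accept strings such as "1.2.3", so a stricter amount like [0-9]+(?:\.[0-9]+)? may be preferable if the input is not trusted.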

How do I split a string into two strings using a regular expression?
