Question

I want to write a program that reads the following input:

<repeat value="2" content="helloworld"/>

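Now I need to parse it and store 'repeat', '2', and 'helloworld' in separate variables. So far so good. The problem is that whitespace can appear anywhere in the input, which makes the task much harder and is beyond my abilities. I thought a regular expression might work, but I could not get one working, and my research on the topic turned up nothing. What would be a smart way to do this?

Example: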

< rep eat va lue=" 2"    conte nt= "helloworld"/>

should match

repeat, 2, helloworld

Answer 1

Use this regular expression, which allows whitespace between the tokens:

<\s*(\w+)\s+value\s*=\s*"(\w+)"\s*content\s*=\s*"(\w+)"\s*\/\s*>

This matches the entire input string and captures the tag (first group), the value (second group), and the content (third group).

Test it online at regex101.com
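
As a rough illustration, the pattern above could be used from Java roughly as follows. This is only a sketch: the class name `RepeatTagRegex` and the helper `parse` are made up for this example, the backslashes are doubled because the pattern sits in a Java string literal, and the `\/` escape is written as a plain `/` since Java does not need it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class RepeatTagRegex {
    // The pattern from the answer, escaped for a Java string literal.
    private static final Pattern TAG = Pattern.compile(
        "<\\s*(\\w+)\\s+value\\s*=\\s*\"(\\w+)\"\\s*content\\s*=\\s*\"(\\w+)\"\\s*/\\s*>");

    /** Returns {tag, value, content}, or null if the input does not match. */
    static String[] parse(String input) {
        Matcher m = TAG.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("< repeat   value = \"2\" content=\"helloworld\" / >");
        System.out.println(String.join(", ", parts));
    }
}
```

Because the keywords are matched literally here, input with whitespace inside `value` or `content` is rejected and `parse` returns null.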

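Update

To also allow whitespace inside the keywords value and content, add \s* (which matches any number of whitespace characters, including zero) between each pair of letters. Note that the tag name is now captured with (.+), so any whitespace inside it has to be removed afterwards: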

<\s*(.+)\s+v\s*a\s*l\s*u\s*e\s*=\s*"\s*(\w+)\s*"\s*c\s*o\s*n\s*t\s*e\s*n\s*t\s*=\s*"(.+)"\s*\/\s*>

Test it online at regex101.com
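
A minimal Java sketch of this idea follows. It makes a few assumptions that are not in the answer: the class `SpacedTagRegex` and the helpers `spaced` and `parse` are hypothetical names, the letter-by-letter `\s*` is generated by code instead of typed by hand, the pattern also tolerates whitespace just inside the quotes of the value (as in `" 2"` in the question's example), and whitespace is stripped from the captured tag name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SpacedTagRegex {
    // Turns "value" into "v\s*a\s*l\s*u\s*e" so whitespace may sit between letters.
    // Only safe for plain letters; regex metacharacters would need quoting.
    static String spaced(String keyword) {
        return String.join("\\s*", keyword.split(""));
    }

    private static final Pattern TAG = Pattern.compile(
        "<\\s*(.+)\\s+" + spaced("value") + "\\s*=\\s*\"\\s*(\\w+)\\s*\"\\s*"
        + spaced("content") + "\\s*=\\s*\"(.+)\"\\s*/\\s*>");

    /** Returns {tag, value, content}, or null if the input does not match. */
    static String[] parse(String input) {
        Matcher m = TAG.matcher(input);
        if (!m.matches()) {
            return null;
        }
        // Group 1 may contain whitespace ("rep eat"), so remove it.
        String tag = m.group(1).replaceAll("\\s+", "");
        return new String[] { tag, m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("< rep eat va lue=\" 2\"    conte nt= \"helloworld\"/>");
        System.out.println(String.join(", ", parts));
    }
}
```

Generating the spaced keywords keeps the pattern readable and makes it easy to support further attribute names, at the cost of one extra helper.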

Answer 2

I suggest using a DOM parser such as Jsoup. Of course, the input must be valid XML/HTML for this to work.

package com.example;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class AttributesReader {
    public static void main(String[] args) throws Exception {
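        String xmlStrMessage="<repeat value=\"2\" content=\"helloworld\"/>";
        // Jsoup.parse uses the lenient HTML parser; the unknown <repeat> tag is kept as an element.
        Document doc = Jsoup.parse(xmlStrMessage);
        // Select every <repeat> element in the parsed document.
        Elements repeat = doc.select("repeat");
        // attr() returns the value from the first matched element that has the attribute.
        System.out.println("value:"+repeat.attr("value"));
        System.out.println("content:"+repeat.attr("content"));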
    }
}
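
If adding Jsoup as a dependency is not an option, the same idea can be sketched with the DOM parser that ships with the JDK. This is an alternative to the code above, not what the answer proposed, and the class name `DomAttributesReader` and its `read` method are hypothetical. It only works for well-formed XML: whitespace around `=` and before `/>` is accepted, but whitespace inside names or directly after `<` is not.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class using only JDK classes.
class DomAttributesReader {
    /** Returns {tag, value, content} of the root element of a well-formed XML string. */
    static String[] read(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml)))
                                  .getDocumentElement();
            return new String[] {
                root.getTagName(),
                root.getAttribute("value"),
                root.getAttribute("content")
            };
        } catch (Exception e) {
            // Malformed input (for example "< rep eat ...") ends up here.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String[] parts = read("<repeat  value = \"2\"   content= \"helloworld\" />");
        System.out.println(String.join(", ", parts));
    }
}
```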

Java: matching individual words that may or may not be separated by spaces
