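Read specific words from a string and parse the values that follow them

Time: 2017-04-21 09:21:59

Tags: java arrays

I am writing a Java program that fetches HTML data from the CSS class .report: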

@RequestMapping(value = "/medindiaparser", method = RequestMethod.POST)
public ModelMap medindiaparser(@RequestParam String urlofpage) throws ClassNotFoundException, IOException {
    System.out.println("saveMedicineName");
    ModelMap mv = new ModelMap(urlofpage);
    System.out.println();
    String url = urlofpage;
    Document document = Jsoup.connect(url).get();

    String TITLE = document.select(".report").text();
    String[] news = TITLE.split(":");
    System.out.println("Question: " + TITLE);

    return mv;
}

This is what TITLE gives me at the moment:

name : aman kumar working in : home,outside what he does: program | sleep | eat

So I want to get the individual values into an array, like this:

array[0] : aman kumar
array[1] : home,outside
array[2] : program | sleep | eat

Then I could set the array's values on my model. Has anyone done this?

.report contains <h3> elements that hold the headings. It looks like this:

<report><h3>Name</h3>aman kumar<h3>working in </h3>home, outside .....</report>

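2 answers:

Answer 0: (score: 1)

I have completely rewritten my answer: it now extracts the contents of name, working in, and what he does from the TITLE string. This can be done in Java with a regular-expression Pattern and Matcher.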

String pattern = "name\\s*:\\s*(.*?)\\s*working in\\s*:\\s*(.*?)\\s*what he does\\s*:\\s*(.*)";
Pattern r = Pattern.compile(pattern);
String line = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
Matcher m = r.matcher(line);
while (m.find()) {
    System.out.println(m.group(1));
    System.out.println(m.group(2));
    System.out.println(m.group(3));
}

Output:

aman kumar
home,outside
program | sleep | eat

Demo here:

Rextester
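
Since the question asks for the values in an array, the matcher can be wrapped in a small helper. This is only a minimal sketch, assuming the three labels always appear in this order; the class name ReportFields and the method extract are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps the regex from the answer and returns the captures as an array.
final class ReportFields {
    // Same pattern as in the answer, compiled once and reused.
    private static final Pattern FIELDS = Pattern.compile(
        "name\\s*:\\s*(.*?)\\s*working in\\s*:\\s*(.*?)\\s*what he does\\s*:\\s*(.*)");

    // Returns the three captured values, or an empty array when the text does not match.
    static String[] extract(String text) {
        Matcher m = FIELDS.matcher(text);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String title = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
        String[] array = extract(title);
        for (int i = 0; i < array.length; i++) {
            System.out.println("array[" + i + "] : " + array[i]);
        }
    }
}
```

In the controller from the question, the returned array could then be stored on the ModelMap under whatever attribute name suits the view.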

Answer 1: (score: 0)

Try this:

String s = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
String[] news = s.split(":");
String exclude = "(working in|what he does)";
int index = -1;
for(int i = 0 ; i < news.length ; i++){
    if("name".equals(news[i].trim())){
        index = i;
        break;
    }
}
if(index != -1){
    String[] content = Arrays.copyOfRange(news, index+1, news.length);
    for(String string : content){
        // Strip the label first, then trim, so no trailing space is left behind
        System.out.println(string.replaceAll(exclude, "").trim());
    }
}
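
A shorter variant of the same idea is to split on the labels themselves rather than on the colon alone, so nothing has to be stripped afterwards. The following is a rough sketch under the assumption that the label set is fixed; LabelSplitter and values are hypothetical names chosen for this example.

```java
import java.util.Arrays;

// Hypothetical helper: splits the text on "label :" separators and keeps only the values.
final class LabelSplitter {
    static String[] values(String text) {
        // Each separator is a known label, a colon, and any surrounding whitespace.
        String[] parts = text.split("\\s*(?:name|working in|what he does)\\s*:\\s*");
        // When the text starts with a label, split() yields an empty first element; drop it.
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        String s = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
        for (String value : values(s)) {
            System.out.println(value);
        }
    }
}
```

Unlike the regex with capture groups, this does not depend on the order of the labels, but it also cannot tell which value belongs to which label.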