How to extract content between dashed lines

Time: 2015-12-21 20:31:41

Tags: java regex java.util.scanner

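My text is in Markdown format and consists of two parts: the first part, enclosed by dashed lines, is metadata, and the rest, after the dashed lines, is the actual content.

My format looks like this: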

---
toc:
    customization:
        title: Customization
        themes: Themes
        plugins: Plugins
nav: 5
---

summary: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque vel diam purus.
body:Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque vel diam purus.

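I want to extract the content between these dashed lines and store it in a separate HashMap (without storing the dashes themselves in the map), and likewise store the actual content in a HashMap of its own.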

Scanner scanner = new Scanner(new FileReader("src/main/webapp/WEB-INF/content/" + url + ".md"));

HashMap<String, String> map = new HashMap<String, String>();

while (scanner.hasNextLine()) {
    String[] columns = scanner.nextLine().split(":");

    for (int i = 0; i < columns.length; i++) {
        if (!columns[i].isEmpty() && !columns[i].contains("---")) {
            map.put(columns[0], columns[1]);
        }
    }
}

scanner.close();
System.out.println(map);

Can anyone tell me how to extract the lines between the dashes into one HashMap and store the actual content in a separate HashMap?

3 answers:

Answer 0: (score: 3)

Just keep track of which map to use:

Map<String, String> regularMap = new HashMap<String, String>();
Map<String, String> separateMap = new HashMap<String, String>();

Map<String, String> currentMap = regularMap;
boolean inDashes = false;
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    if (line.equals("---")) {
        // switch state
        inDashes = !inDashes;
        currentMap = inDashes ? separateMap : regularMap;
    } else if (!line.trim().isEmpty()) {
        // split on the first colon only, so a value may itself contain ':'
        String[] columns = line.split(":", 2);
        // a line without a value (e.g. "toc:") is stored with an empty string
        String value = columns.length > 1 ? columns[1].trim() : "";
        currentMap.put(columns[0].trim(), value);
    }
}
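
For anyone who wants to try the state-switching idea above without a file on disk, here is a minimal self-contained sketch. The class name `FrontMatterSplitter`, the method `split`, and the inline sample text are assumptions made for illustration and do not come from the question. It uses `LinkedHashMap` only so that the printed order matches the input order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

public class FrontMatterSplitter {

    // Returns two maps: index 0 holds the lines between the dashes, index 1 the rest.
    static List<Map<String, String>> split(String text) {
        Map<String, String> metaData = new LinkedHashMap<>();
        Map<String, String> content = new LinkedHashMap<>();
        boolean inDashes = false;

        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (line.trim().equals("---")) {
                    inDashes = !inDashes; // a dashed line toggles the state
                    continue;
                }
                if (line.trim().isEmpty()) {
                    continue; // ignore blank lines
                }
                // Limit of 2: split on the first colon only.
                String[] parts = line.split(":", 2);
                String value = parts.length > 1 ? parts[1].trim() : "";
                (inDashes ? metaData : content).put(parts[0].trim(), value);
            }
        }
        return Arrays.asList(metaData, content);
    }

    public static void main(String[] args) {
        // Hypothetical sample input, shortened from the format in the question.
        String sample = String.join("\n",
                "---",
                "toc:",
                "    customization:",
                "        title: Customization",
                "nav: 5",
                "---",
                "",
                "summary: Lorem ipsum",
                "body:Lorem ipsum");
        List<Map<String, String>> maps = split(sample);
        System.out.println(maps.get(0)); // {toc=, customization=, title=Customization, nav=5}
        System.out.println(maps.get(1)); // {summary=Lorem ipsum, body=Lorem ipsum}
    }
}
```

Note that this flattens the nested keys (`toc`, `customization`, `title`) into one level, just like the snippet above; keeping the hierarchy would need a real YAML parser.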

Answer 1: (score: 0)

This simple code should work fine.

public static void parseFile(String url) throws FileNotFoundException {
    Scanner scanner = new Scanner(new FileReader("src/main/webapp/WEB-INF/content/" + url + ".md"));

    // create two maps
    HashMap<String, String> metaData = new HashMap<String, String>();
    HashMap<String, String> content = new HashMap<String, String>();

    // and a marker
    boolean isMetaData = false;

    while (scanner.hasNextLine()) {
        String nextLine = scanner.nextLine();
        if ("---".equals(nextLine)) { // if the line equals "---" then
            isMetaData = !isMetaData; // flip the marker
            continue; // and skip this line
        }
        if (nextLine.trim().isEmpty()) {
            continue; // skip blank lines so they do not end up as empty keys
        }

        // add the value from the line to the proper map;
        // the limit of 2 splits on the first colon only
        addPropertyToMap(isMetaData ? metaData : content, nextLine.split(":", 2));
    }
    scanner.close();

    System.out.println(metaData);
    System.out.println(content);
}

private static void addPropertyToMap(HashMap<String, String> map, String[] columns){
    if(columns.length == 1){
        // if the key has no value, store an empty string instead
        map.put(columns[0].trim(), "");
    } else {
        map.put(columns[0].trim(), columns[1].trim());
    }
}

When you run this code, you should get the following result:

{themes=Themes, nav=5, customization=, plugins=Plugins, toc=, title=Customization}
{summary=Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque vel diam purus., body=Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque vel diam purus.}
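
Since the question is also tagged regex, another option is to isolate the block between the dashes with a single pattern first and only then split it into key/value lines. The following is a rough sketch, not a definitive implementation: it assumes the whole file has already been read into a string and that the metadata block starts on the very first line. The class name `FrontMatterRegex` and the method `extract` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FrontMatterRegex {

    // Opening "---" line, lazily captured block, closing "---", then the rest.
    // DOTALL lets '.' match line breaks; \R matches any line break sequence.
    private static final Pattern FRONT_MATTER =
            Pattern.compile("\\A---\\R(.*?)\\R---\\R?(.*)\\z", Pattern.DOTALL);

    // Returns {metadata, content}; metadata is empty if no dashed block is found.
    static String[] extract(String text) {
        Matcher m = FRONT_MATTER.matcher(text);
        if (m.matches()) {
            return new String[] { m.group(1), m.group(2) };
        }
        return new String[] { "", text };
    }

    public static void main(String[] args) {
        // Hypothetical sample input in the format from the question.
        String sample = "---\ntoc:\nnav: 5\n---\n\nsummary: Lorem ipsum";
        String[] parts = extract(sample);
        System.out.println("metadata:\n" + parts[0]);
        System.out.println("content:\n" + parts[1]);
    }
}
```

Each of the two strings can then be fed line by line into a method such as `addPropertyToMap` above. The pattern is deliberately simple: it does not handle an empty metadata block, and it treats any line that begins with `---` as the closing delimiter.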

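Answer 2: (score: 0)

I think the best way to parse and extract the data is to use a Model class, then parse the data into Model object instances and store them in a list.

Model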

public class Model {

    // Fields
    private String toc;
    private String customization;
    private String title;
    private String themes;
    private String plugins;
    private int nav;

    // No-arg Default Constructor
    public Model() {}

    // Getters & Setters
    public String getToc() {
        return toc;
    }

    public void setToc(String toc) {
        this.toc = toc;
    }

    public String getCustomization() {
        return customization;
    }

    public void setCustomization(String customization) {
        this.customization = customization;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getThemes() {
        return themes;
    }

    public void setThemes(String themes) {
        this.themes = themes;
    }

    public String getPlugins() {
        return plugins;
    }

    public void setPlugins(String plugins) {
        this.plugins = plugins;
    }

    public int getNav() {
        return nav;
    }

    public void setNav(int nav) {
        this.nav = nav;
    }
}

Test

import java.io.FileReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class Test {

    private static final int OBJECT_LINE_LENGTH = 6;

    public static void main(String[] args) throws Exception {

        String url = "testFile";
        Scanner scanner = new Scanner(new FileReader("src/main/webapp/WEB-INF/content/" + url + ".md"));
        boolean isInContent = false;
        int lineCounter = 0;
        String lines[] = new String[OBJECT_LINE_LENGTH];
        List<Model> objectList = new ArrayList<Model>();

        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();

            if(line.equals("---")) {
                // a dashed line toggles between inside and outside a block;
                // the delimiter itself is never stored
                isInContent = !isInContent;
                continue;
            }

            if(!isInContent) {
                continue;
            }

            lines[lineCounter++] = line;

            if(lineCounter == OBJECT_LINE_LENGTH) {
                lineCounter = 0;
                objectList.add(parseObject(lines));
            }

        }
        scanner.close();

        printObjectList(objectList);
    }


    private static Model parseObject(String lines[]) {
        Model model = new Model();

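        // Arrays.toString prints the elements; calling toString() on the array itself would not
        model.setToc(Arrays.toString(Arrays.copyOfRange(lines, 1, 4)));
        model.setCustomization(Arrays.toString(Arrays.copyOfRange(lines, 2, 4)));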
        model.setTitle(lines[2].split(":")[1].trim());
        model.setThemes(lines[3].split(":")[1].trim());
        model.setPlugins(lines[4].split(":")[1].trim());
        model.setNav(Integer.parseInt(lines[5].split(":")[1].trim()));

        return model;
    }

    private static void printObjectList(List<Model> objectList) {
        for(Model m : objectList) {
            System.out.println();
            System.out.println("Title  : " + m.getTitle());
            System.out.println("Themes : " + m.getThemes());
            System.out.println("Plugins: " + m.getPlugins());
            System.out.println("Nav    : " + m.getNav());
        }
    }

}

Sample md file used for testing

---
toc:
    customization:
        title: Customization
        themes: Themes
        plugins: Plugins
nav: 5
---
---
toc:
    customization:
        title: anotherCustomization
        themes: anotherThemes
        plugins: anotherPlugins
nav: 10
---
---
toc:
    customization:
        title: ThirdCustomization
        themes: ThirdTheme
        plugins: ThirdPlugin
nav: 15
---




---
toc:
    customization:
        title: fourthCustomization
        themes: fourthTheme
        plugins: fourthPlugin
nav: 20
---

Test output

Title  : Customization
Themes : Themes
Plugins: Plugins
Nav    : 5

Title  : anotherCustomization
Themes : anotherThemes
Plugins: anotherPlugins
Nav    : 10

Title  : ThirdCustomization
Themes : ThirdTheme
Plugins: ThirdPlugin
Nav    : 15

Title  : fourthCustomization
Themes : fourthTheme
Plugins: fourthPlugin
Nav    : 20