Generic XML/JSON parser (preferably SAX or StAX): supplying node/XPath information for each XML format through a settings file

Time: 2018-07-20 06:46:14

Tags: java xml xml-parsing sax stax

I am posting this to find out whether any existing tool or code already addresses the problem and solution described below.

Requirement/expectation: parse 30+ XML log files that have different data structures, and populate one common format from all of them.

Example (XML data):

XML1: <xml>
        <name>abcd</name>
        <mark>10</mark>
        <employee_org_name>org1</employee_org_name>
        <employee_payroll>true</employee_payroll>
      </xml>

XML2: <xml>
        <employee_name>deft</employee_name>
        <score>10</score>
        <org>
            <name>org2</name>
            <payroll>false</payroll>
        </org>
      </xml>

XML3: <xml>
        <org name="org1">
            <employee>
                <name>ryan</name>
                <score>10</score>
            </employee>
            <name>org3</name>
            <payroll>true</payroll>
        </org>
      </xml>

The settings file for each XML format would look like this:

XML1 settings file:
    Employee_name: /name
    Employee_mark: /mark
    Employee_org: /employee_org_name
    Employee_org_Payroll: /employee_payroll
    Employee_extras: anything

XML2 settings file:
    Employee_name: /employee_name
    Employee_mark: /score
    Employee_org: /org/name
    Employee_org_Payroll: /org/payroll
    Employee_extras: anything

XML3 settings file: 
    Employee_name: /org/employee/name
    Employee_mark: /org/employee/score
    Employee_org: /org:name
    Employee_org_Payroll: /org/payroll
    Employee_extras: anything
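
As a rough illustration of the settings-file idea, the sketch below evaluates one configured path per common field using the JDK's `javax.xml.xpath` API. It is DOM-based rather than SAX/StAX, so it only suits files that fit in memory. The class name `XPathFieldMapper` and the `extract` method are hypothetical, and the sketch assumes the paths in the settings files are relative to the `<xml>` root element, so it prepends `/xml` to each one.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not taken from any existing library.
final class XPathFieldMapper {

    /** Returns common field name -> text value, one entry per configured path. */
    static Map<String, String> extract(String xml, Map<String, String> fieldToPath) {
        try {
            // Parse once, then run every configured expression against the same tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<String, String> record = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : fieldToPath.entrySet()) {
                // Assumption: settings paths are relative to the <xml> root.
                record.put(e.getKey(), xpath.evaluate("/xml" + e.getValue(), doc));
            }
            return record;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML or evaluate a path", ex);
        }
    }

    public static void main(String[] args) {
        String xml2 = "<xml><employee_name>deft</employee_name><score>10</score>"
                + "<org><name>org2</name><payroll>false</payroll></org></xml>";
        // In practice this map would be loaded from the XML2 settings file.
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("Employee_name", "/employee_name");
        settings.put("Employee_mark", "/score");
        settings.put("Employee_org", "/org/name");
        settings.put("Employee_org_Payroll", "/org/payroll");
        System.out.println(extract(xml2, settings));
    }
}
```

A path that matches nothing yields an empty string here, which a real implementation would probably want to distinguish from an empty element.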

Example (JSON):

JSON1: {"name": "abcd", "mark": "10", "employee_org_name": "org1", "employee_payroll": "org1"}

JSON2: {"employee_name": "deft", "score": "10", "details": {"name": "org1", "payroll": "true"}}

JSON3: {"org1": {"employee": {"name": "ryan", "points": "10"}, "name": "org1", "payroll": "true"}}

Note: settings files for the JSON inputs can be defined in the same way.

Output (should be in one common format, stored only as JSON):

Format: {Employee_name: str, Employee_mark: int, Employee_org: str, Employee_org_Payroll: boolean, Employee_extras: object/array}

Data:

[
        {"Employee_name": "abcd", "Employee_mark": 10, "Employee_org": "org1", "Employee_org_Payroll": true, "Employee_extras": null},
        {"Employee_name": "deft", "Employee_mark": 10, "Employee_org": "org2", "Employee_org_Payroll": false, "Employee_extras": null},
        {"Employee_name": "ryan", "Employee_mark": 10, "Employee_org": "org3", "Employee_org_Payroll": true, "Employee_extras": null}
]
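
To make the typing in the common format concrete, here is a minimal sketch that turns the raw string values extracted from a source file into one typed JSON record. The class `CommonRecord` and its methods are hypothetical. The JSON is assembled by hand only to keep the example dependency-free; it escapes just backslashes and double quotes, so a JSON library would be the safer choice for real data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: converts extracted strings into the common typed record.
final class CommonRecord {

    /** Renders one record; Employee_extras is always null in this sketch. */
    static String toJson(Map<String, String> raw) {
        return "{"
                + "\"Employee_name\": " + str(raw.get("Employee_name")) + ", "
                + "\"Employee_mark\": " + integer(raw.get("Employee_mark")) + ", "
                + "\"Employee_org\": " + str(raw.get("Employee_org")) + ", "
                + "\"Employee_org_Payroll\": " + bool(raw.get("Employee_org_Payroll")) + ", "
                + "\"Employee_extras\": null"
                + "}";
    }

    // Quotes a string value; escapes only backslash and double quote.
    static String str(String s) {
        if (s == null) return "null";
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Throws NumberFormatException if the source value is not an integer.
    static String integer(String s) {
        return s == null ? "null" : String.valueOf(Integer.parseInt(s.trim()));
    }

    // Anything other than "true" (case-insensitive) becomes false.
    static String bool(String s) {
        return s == null ? "null" : String.valueOf(Boolean.parseBoolean(s.trim()));
    }

    public static void main(String[] args) {
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("Employee_name", "abcd");
        raw.put("Employee_mark", "10");
        raw.put("Employee_org", "org1");
        raw.put("Employee_org_Payroll", "true");
        System.out.println(toJson(raw));
    }
}
```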

Simple solution: write a dedicated class or method for each XML format (1, 2, 3, ...) and hard-code the nodes in the code.

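Expected solution (my own idea): write one generic parser. When the parser is triggered, it loads an XML file, reads it and identifies its format (1/2/3/...), loads the settings file for that format, processes the file, and writes the result in the generic/common output format described above.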
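
A minimal streaming sketch of that idea, using the JDK's StAX API (`javax.xml.stream`): the reader keeps a stack of open element names, and whenever an element closes, its full path is looked up in the settings for the current format. The class `StaxPathExtractor` and its methods are hypothetical. The sketch assumes paths are written from the document root (for example `/xml/org/name`), handles only leaf elements (no attributes), and leaves out format detection and settings-file loading.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: maps element paths to common field names while streaming.
final class StaxPathExtractor {

    /** pathToField maps a full element path, e.g. "/xml/org/name", to a field name. */
    static Map<String, String> extract(String xml, Map<String, String> pathToField) {
        Map<String, String> record = new LinkedHashMap<>();
        Deque<String> open = new ArrayDeque<>();   // open elements, innermost first
        StringBuilder text = new StringBuilder();  // text seen since the last start tag
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // log files need no DTDs
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        open.push(reader.getLocalName());
                        text.setLength(0);
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT: {
                        String field = pathToField.get(currentPath(open));
                        if (field != null) {
                            record.put(field, text.toString().trim());
                        }
                        open.pop();
                        text.setLength(0);
                        break;
                    }
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return record;
    }

    // Builds "/outer/.../inner" by walking the stack from outermost to innermost.
    static String currentPath(Deque<String> open) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> it = open.descendingIterator(); it.hasNext(); ) {
            sb.append('/').append(it.next());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml3 = "<xml><org name=\"org1\"><employee><name>ryan</name><score>10</score>"
                + "</employee><name>org3</name><payroll>true</payroll></org></xml>";
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("/xml/org/employee/name", "Employee_name");
        settings.put("/xml/org/employee/score", "Employee_mark");
        settings.put("/xml/org/name", "Employee_org");
        settings.put("/xml/org/payroll", "Employee_org_Payroll");
        System.out.println(extract(xml3, settings));
    }
}
```

Because the lookup is keyed by the full path, the two `<name>` elements in XML3 stay distinct without any format-specific code; supporting a new format only means adding another settings map.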

Please let me know if my question is unclear or if more information is needed. I'm here!

Thanks in advance!

1 Answer:

Answer 0: (score: 0)

In the end I found this code, which largely solves my problem. The only thing I still need to work out is whether or not to use a HashMap with it.

https://github.com/niteshapte/generic-xml-parser