使用SAX搜索并替换XML字段内容

时间:2017-11-08 19:12:56

标签: java xml sax

在我的Java映射中,我需要搜索并替换一个ASCII控件字符(BEL - 0x07),这会破坏映射路由,因为XML无法被解析。

当前的java映射虽然是对控件字符进行硬编码(只是为了简单并使其工作),但是由于源XML大小(~5mb)而需要很长时间(30分钟)。< / p>

使用SAX解析器会更快吗? 任何人都可以对此有所了解,因为我在Java编码方面不是很有经验吗?

这是我的源XML:

<?xml version="1.0" encoding="utf-8"?>
<ns:ShipmentStatusList xmlns:ns="http://namespace.com/ShipmentStatus/100">
    <Recordset>
        <ShipmentStatusDetails>
            <CarInitial>CAR</CarInitial>
            <CarNumber>1111</CarNumber>
            <ShipmentDateTime>07-14-201700:00:00:000</ShipmentDateTime>
            <TripRef1/>
            <TripRef2/>
            <TripRef3>7/18/2016</TripRef3>
            <TripComments>[BEL] control char comes here</TripComments>
            <ReturnCity>City</ReturnCity>
            <ReturnState>ST</ReturnState>
        </ShipmentStatusDetails>
    </Recordset>
</ns:ShipmentStatusList>

我需要最简单/最快捷的方式来搜索并用可读字符替换[BEL]字符(例如&#39;?&#39;)。

以下是我的工作方式(对于重源XML需要很长时间才能完成):

package com.map;

import java.io.BufferedReader;
import java.io.File;
import java.io.InputStream;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;

import com.sap.aii.mapping.api.AbstractTransformation;
import com.sap.aii.mapping.api.StreamTransformationException;
import com.sap.aii.mapping.api.TransformationInput;
import com.sap.aii.mapping.api.TransformationOutput;

public class JM_StrippingInvalidXMLChars extends AbstractTransformation {

    public static void main(String[] args){
        /*This method is used only for tests (reading and writing files locally)*/
        try{
            InputStream input = new FileInputStream(new File("C:\\Sync\\in.xml"));

            String inputPayload = JM_StrippingInvalidXMLChars.convertInputStreamToString(input);
            String outputPayload = JM_StrippingInvalidXMLChars.cleanUp(inputPayload);

        }catch(Exception e){
            e.printStackTrace();
        }
    }

    public void transform(TransformationInput arg0, TransformationOutput arg1)
    throws StreamTransformationException {
        getTrace().addInfo("JAVA program StrippingInvalidXMLChars started");
        String inputPayload = convertInputStreamToString(arg0.getInputPayload().getInputStream());
        String outputPayload = cleanUp(inputPayload);
        try {
            arg1.getOutputPayload().getOutputStream().write(
                    outputPayload.getBytes("UTF-8"));
        } catch 
            (Exception exception1) {
        }
}

    public static String convertInputStreamToString(InputStream in) {
        StringBuffer sb = new StringBuffer();
        try {
            InputStreamReader isr = new InputStreamReader(in);
            Reader reader = new BufferedReader(isr);
            int ch;
            while ((ch = in.read()) > -1) {
                sb.append((char) ch);
            }
            reader.close();
        } 
        catch (Exception exception) {
        }
        return sb.toString();
    }

    public static String cleanUp(String instructions){
        String res = "";
        String c = "";
        String strRplc = "?";
        int hexCode = 0;

        if (instructions != null && instructions.length()>0){
            for (int i=0; i<instructions.length(); i++){
                c = instructions.substring(i, i+1);
                hexCode = c.hashCode();
                //if (hexCode>31 && hexCode<127 && hexCode != 96){
                if (hexCode!= 07){
                    res = res.concat(c);
                }else{
                    res = res.concat(strRplc);
                }
            }
        }
        return res;
    }
}

0 个答案:

没有答案