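Automating the conversion of a data format to a parent-child format

Time: 2010-03-06 01:04:20

Tags: java excel vbscript automation

This is an Excel sheet in which only one column is filled per row. (Explanation: all CITY categories come under V21, all Handset categories come under CITYJ, and so on.)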

   V21                  
       CITYR
       CITYJ
           HandsetS
           HandsetHW
           HandsetHA
               LOWER_AGE<=20
               LOWER_AGE>20     
                   SMS_COUNT<=0 
                       RECHARGE_MRP<=122
                       RECHARGE_MRP>122
                   SMS_COUNT>0

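I need to change this into a two-column format, with the parent in the first column and the child category in the second. The output sheet would therefore be: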

    V21           CITYR
    V21           CITYJ
    CITYJ         HandsetS
    CITYJ         HandsetHW
    CITYJ         HandsetHA
    HandsetHA     LOWER_AGE<=20
    HandsetHA     LOWER_AGE>20      
    LOWER_AGE>20    SMS_COUNT<=0    
    SMS_COUNT<=0    RECHARGE_MRP<=122
    SMS_COUNT<=0    RECHARGE_MRP>122
    LOWER_AGE>20    SMS_COUNT>0

The data is huge, so I can't do this manually. How can I automate it?

2 answers:

Answer 0: (score: 3)

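There are 3 parts to this task, so I'd like to know which of them you are asking for help with.

  1. Reading the Excel sheet data into Java
  2. Manipulating the data
  3. Writing the data back to the Excel sheet

You have said that the data sheet is large and cannot be pulled into memory as a whole. May I ask how many top-level elements you have? That is, how many V21s do you have? If there is only one, then how many CITYR / CITYJ entries do you have?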
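
For steps 1 and 3, one low-tech route (an assumption here, not something the question specifies) is to go through tab-delimited text: Excel can save a sheet as "Text (Tab delimited)" and open such a file again. The sketch below uses a hypothetical `TabFileIo` helper; it assumes the content is UTF-8 compatible and, for `readRows`, that the file fits in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the answer's code: reads and writes
// tab-delimited text as a stand-in for direct Excel access.
class TabFileIo {

    // Step 1: read the exported sheet, one line per row, indentation preserved.
    // Assumption: the whole file fits in memory.
    static List<String> readRows(Path source) throws IOException {
        return Files.readAllLines(source, StandardCharsets.UTF_8);
    }

    // Step 3: write one "parent<TAB>child" line per pair.
    static void writePairs(List<String[]> pairs, Path target) throws IOException {
        List<String> lines = new ArrayList<String>();
        for (String[] pair : pairs) {
            lines.add(pair[0] + "\t" + pair[1]);
        }
        Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("pairs", ".txt");
        writePairs(Arrays.asList(
                new String[] {"V21", "CITYR"},
                new String[] {"V21", "CITYJ"}), out);
        System.out.println(readRows(out));
        Files.delete(out);
    }
}
```

A file written this way should open in Excel as two columns, parent in the first and child in the second.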


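Following on from my earlier answer, here is some source code showing how to manipulate the data. I gave it a text input file in which every 4 leading spaces stand for one column of your Excel sheet, and the code below prints the parent-child pairs. Note that the `level == 1` branch is left empty. If you think your JVM is holding too many objects, you can clear the entries and the stack at that point :)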

    package com.ekanathk;
    
    import java.io.BufferedReader;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Stack;
    import java.util.logging.Logger;
    
    import org.junit.Test;
    
    class Entry {
        private String input;
        private int level;
        public Entry(String input, int level) {
            this.input = input;
            this.level = level;
        }
        public String getInput() {
            return input;
        }
        public int getLevel() {
            return level;
        }
        @Override
        public String toString() {
            return "Entry [input=" + input + ", level=" + level + "]";
        }
    }
    
    public class Tester {
    
        private static final Logger logger = Logger.getLogger(Tester.class.getName());
    
        @SuppressWarnings("unchecked")
        @Test
        public void testSomething() throws Exception {
    
            InputStream is = Thread.currentThread().getContextClassLoader().getResourceAsStream("samplecsv.txt");
            BufferedReader b = new BufferedReader(new InputStreamReader(is));
            String input = null;
            List<String[]> entries = new ArrayList<String[]>();
            Stack<Entry> stack = new Stack<Entry>();
            stack.push(new Entry("ROOT", -1));
            while((input = b.readLine()) != null){
                int level = whatIsTheLevel(input);
                input = input.trim();
                logger.info("input = " + input + " at level " + level);
                Entry entry = new Entry(input, level);
                if(level == 1) {
                    //periodically clear out the map and write it to another excel sheet
                }
                // Pop siblings and deeper rows until the top of the stack is the
                // parent; a single equality check misses rows that move up a level.
                while (stack.peek().getLevel() >= entry.getLevel()) {
                    stack.pop();
                }
                Entry parent = stack.peek();
                logger.info("parent = " + parent);
                entries.add(new String[]{parent.getInput(), entry.getInput()});
                stack.push(entry);
            }
            b.close();
            for(String[] pair : entries) {
                System.out.println(Arrays.toString(pair));
            }
        }
    
        private int whatIsTheLevel(String input) {
            int numberOfSpaces = 0;
            for(int i = 0 ; i < input.length(); i++) {
                if(input.charAt(i) != ' ') {
                    return numberOfSpaces/4;
                } else {
                    numberOfSpaces++;
                }
            }
            return numberOfSpaces/4;
        }
    }
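
The memory concern mentioned above can be sketched as a streaming variant: each pair is written out as soon as its parent is known, and only the current chain of ancestors stays in memory. The class and method names (`StreamingPairs`, `convert`, `levelOf`) are hypothetical, and the sketch keeps the assumption of four leading spaces per level. Unlike the test above, it emits no pair for top-level rows, which matches the expected output in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical streaming variant: nothing is accumulated except the
// ancestors of the row currently being read.
class StreamingPairs {

    // Assumption carried over from the answer: four leading spaces per level.
    static int levelOf(String line) {
        int spaces = 0;
        while (spaces < line.length() && line.charAt(spaces) == ' ') {
            spaces++;
        }
        return spaces / 4;
    }

    static void convert(BufferedReader in, Writer out) throws IOException {
        Deque<String> names = new ArrayDeque<String>();    // ancestor names
        Deque<Integer> levels = new ArrayDeque<Integer>(); // their levels
        String line;
        while ((line = in.readLine()) != null) {
            String name = line.trim();
            if (name.isEmpty()) {
                continue; // skip blank rows
            }
            int level = levelOf(line);
            // Drop everything that is not an ancestor of this row.
            while (!levels.isEmpty() && levels.peek() >= level) {
                levels.pop();
                names.pop();
            }
            if (!names.isEmpty()) {
                out.write(names.peek() + "\t" + name + "\n");
            }
            names.push(name);
            levels.push(level);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        String sample = "V21\n    CITYR\n    CITYJ\n        HandsetS\n";
        StringWriter out = new StringWriter();
        convert(new BufferedReader(new StringReader(sample)), out);
        System.out.print(out);
    }
}
```

In real use the `Writer` would wrap a file rather than a `StringWriter`, so the output never has to fit in memory either.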
    

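Answer 1: (score: 1)

This assumes that your file is small enough to fit in your computer's memory. Even a 10MB file should be fine.

It has two parts:

DataTransformer does all the work needed to transform the data

TreeNode is a simple custom tree data structure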

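import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class DataTransformer {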

    public static void main(String[] args) throws IOException {
        InputStream in = DataTransformer.class
                .getResourceAsStream("source_data.tab");
        BufferedReader br = new BufferedReader(
                new InputStreamReader(in));
        String line;
        TreeNode root = new TreeNode("ROOT", Integer.MIN_VALUE);
        TreeNode currentNode = root;
        while ((line = br.readLine()) != null) {
            int level = getLevel(line);
            String value = line.trim();
            TreeNode nextNode = new TreeNode(value, level);
            relateNextNode(currentNode, nextNode);
            currentNode = nextNode;
        }
        printAll(root);
    }

    public static int getLevel(String line) {
        final char TAB = '\t';
        int numberOfTabs = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) != TAB) {
                break;
            }
            numberOfTabs++;
        }
        return numberOfTabs;
    }

    public static void relateNextNode(
            TreeNode currentNode, TreeNode nextNode) {
        if (currentNode.getLevel() < nextNode.getLevel()) {
            currentNode.addChild(nextNode);
        } else {
            relateNextNode(currentNode.getParent(), nextNode);
        }
    }

    public static void printAll(TreeNode node) {
        if (!node.isRoot() && !node.getParent().isRoot()) {
            System.out.println(node);
        }
        for (TreeNode childNode : node.getChildren()) {
            printAll(childNode);
        }
    }
}

class TreeNode implements Serializable {

    private static final long serialVersionUID = 1L;

    private TreeNode parent;
    private List<TreeNode> children = new ArrayList<TreeNode>();
    private String value;
    private int level;

    public TreeNode(String value, int level) {
        this.value = value;
        this.level = level;
    }

    public void addChild(TreeNode child) {
        child.parent = this;
        this.children.add(child);
    }

    public void addSibling(TreeNode sibling) {
        TreeNode parent = this.parent;
        parent.addChild(sibling);
    }

    public TreeNode getParent() {
        return parent;
    }

    public List<TreeNode> getChildren() {
        return children;
    }

    public String getValue() {
        return value;
    }

    public int getLevel() {
        return level;
    }

    public boolean isRoot() {
        return this.parent == null;
    }

    public String toString() {
        String str;
        if (this.parent != null) {
            str = this.parent.value + '\t' + this.value;
        } else {
            str = this.value;
        }
        return str;
    }
}
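
`getLevel` above counts only leading tab characters. If the exported file is indented with spaces instead (the other answer assumes four per column), a variant along these lines might cover both cases. `IndentLevel`, `indentLevel`, and `SPACES_PER_LEVEL` are hypothetical names, and the width of four is an assumption.

```java
// Hypothetical alternative to getLevel: counts a leading tab as one level
// and, as an assumption, every SPACES_PER_LEVEL leading spaces as one level.
class IndentLevel {

    static final int SPACES_PER_LEVEL = 4; // assumed column width

    static int indentLevel(String line) {
        int tabs = 0;
        int spaces = 0;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\t') {
                tabs++;
            } else if (c == ' ') {
                spaces++;
            } else {
                break; // first character of the value itself
            }
        }
        return tabs + spaces / SPACES_PER_LEVEL;
    }

    public static void main(String[] args) {
        System.out.println(indentLevel("\t\tHandsetS"));     // 2
        System.out.println(indentLevel("        HandsetS")); // 2
        System.out.println(indentLevel("V21"));              // 0
    }
}
```

Swapping this in for `getLevel` leaves the rest of `DataTransformer` unchanged, since only the relative order of levels matters to `relateNextNode`.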