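List groupingBy object fields and count occurrences

Time: 2017-11-25 21:15:45

Tags: java java-stream

I have a List<TermNode> that contains all the operands of an operation such as + or *. That is, it is not a binary tree but a tree in which each node can have several children. A TermNode can be a variable, an operator, a function or a number. Number and variable nodes hold an empty list of children (think of the children as the arguments of an operation).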

public class TermNode {

    /**
     * The value this {@link TermNode} holds.
     */
    private String value;

    /**
     * The arguments of this {@link TermNode}.
     * 
     * This {@link List} is empty if the {@link TermTypeEnum} of this {@link TermNode}
     * is either {@link TermTypeEnum}.NUMBER or {@link TermTypeEnum}.VARIABLE.
     */
    private List<TermNode> children;

    /**
     * The {@link TermTypeEnum} of this {@link TermNode}.
     */
    private TermTypeEnum type;

    public TermNode(ComplexNumber number) {
        this(number.toString(), new ArrayList<TermNode>(), TermTypeEnum.NUMBER);
    }

    public TermNode(String variable) {
        this(variable, new ArrayList<TermNode>(), TermTypeEnum.VARIABLE);
    }

    public TermNode(Operator operator, List<TermNode> children) {
        this(operator.getOperator(), children, TermTypeEnum.OPERATOR);
    }

    public TermNode(Function function, List<TermNode> children) {
        this(function.getName(), children, TermTypeEnum.FUNCTION);
    }

    public TermNode(String value, List<TermNode> children, TermTypeEnum type) {
        this.value = value;
        this.children = children;
        this.type = type;
    }

    public TermNode(TermNode node) {
        this.value = node.getValue();
        this.children = node.getChildren();
        this.type = node.getType();
    }

    /**
     * 
     * @return The value of this {@link TermNode}.
     */
    public String getValue() {
        return value;
    }

    /**
     * 
     * @return The {@link List} of arguments for this {@link TermNode}.
     */
    public List<TermNode> getChildren() {
        return children;
    }

    /**
     * 
     * @return The {@link TermTypeEnum} of this {@link TermNode}.
     */
    public TermTypeEnum getType() {
        return type;
    }

    /**
     * 
     * @return True, if this {@link TermNode} represents a {@link ComplexNumber}.
     */
    public boolean isNumber() {
        return this.type == TermTypeEnum.NUMBER;
    }

    /**
     * 
     * @return True, if this {@link TermNode} represents a variable.
     */
    public boolean isVariable() {
        return this.type == TermTypeEnum.VARIABLE;
    }

    /**
     * 
     * @return True, if this {@link TermNode} represents an {@link Operator}.
     */
    public boolean isOperator() {
        return this.type == TermTypeEnum.OPERATOR;
    }

    /**
     * 
     * @return True, if this {@link TermNode} represents a {@link Function}.
     */
    public boolean isFunction() {
        return this.type == TermTypeEnum.FUNCTION;
    }

    /**
     * 
     * @return A {@link HashSet} object to compare two {@link TermNode} elements in equality.
     */
    public HashSet<TermNode> getHash() {
        return new HashSet<>(children);
    }

}

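To simplify expressions such as x + x + x + x or sin(x) * sin(x), I started implementing a method that rewrites a node like x + x + x + x into a 4 * x node.

To begin with, here is an example for the expression x + x + x + x:

  1. Compute the reverse Polish notation (postfix order) of the expression.

    The result is x x x x + + +

  2. Build the expression tree (term tree) from TermNodes (see the TermNode code above). The algorithm is as follows: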

    public void evaluate() {
        Stack<TermNode> stack = new Stack<TermNode>();
    
        for(String token : rpn) {
            if(OperatorUtil.containsKey(token)) {
                Operator operator = OperatorUtil.getOperator(token);
                List<TermNode> children = new ArrayList<TermNode>() {{
                    add(stack.pop());
                    add(stack.pop());
                }};
    
                TermNode node = simplifyOperator(operator, children);
                stack.push(node);
            } else if(FunctionUtil.containsKey(token.toUpperCase())) {
                Function function = FunctionUtil.getFunction(token.toUpperCase());
                List<TermNode> children = new ArrayList<TermNode>(function.getNumParams());
    
                for(int i = 0; i < function.getNumParams(); i++) {
                    children.add(stack.pop());
                }
    
                TermNode node = simplifyFunction(function, children);
                stack.push(node);
            } else {
                if(VariableUtil.containsUndefinedVariable(token.toUpperCase())) {
                    stack.push(new TermNode(token));
                } else {
                    stack.push(new TermNode(new ComplexNumber(token)));
                }
            }
        }
    }
    
  3. Somewhere in the simplifyOperator method I want to collapse variables that have the same value. For the expression above, the tree looks like this:

                _______  OPERATOR  _______
               /           "+"            \
              /         /       \          \
        VARIABLE    VARIABLE   VARIABLE    VARIABLE
          "x"          "x"       "x"          "x"
    
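  4. The goal is to turn this tree into a simpler one by evaluating the TermNode fields in children. The expression consists of one OPERATOR TermNode with the operator +, whose children List<TermNode> contains a VARIABLE TermNode with the value x four times (not the same TermNode instance four times, but four TermNodes with exactly the same value, children and type). This is where my problem comes up:

    private TermNode rearrangeTerm(Operator operator, List<TermNode> children) {
        Map<TermNode, Integer> occurrences = children.stream()
            .collect(Collectors.groupingBy(term -> term, Collectors.summingInt(term -> 1)));

        List<Pair<TermNode, Integer>> simplification = occurrences.entrySet().stream()
            .map(entry -> new Pair<TermNode, Integer>(entry.getKey(), entry.getValue()))
            .collect(Collectors.toList());

        List<TermNode> rearranged = simplification.stream()
            .map(pair -> rearrangeTerm(operator, pair))
            .collect(Collectors.toList());

        return new TermNode(operator.getOperator(), rearranged, TermTypeEnum.OPERATOR);
    }

    The method is called as rearrangeTerm(operator, children), where operator is the + operator and children is the list of TermNode elements that contains the variable x four times.

    At this point, Collectors.groupingBy with term -> term does not group the TermNode elements by their field values (value, children and type) but by the TermNode reference itself. The question is how to group TermNode elements by their fields. Two TermNode elements are equal if all fields match: the order of the elements in the children lists may differ, but the occurrences of the elements in each TermNode's children list must match, and of course the TermNode type must match as well. In short, how do I change the rearrangeTerm method so that the Map contains just one entry ("x", 4) for the expression x + x + x + x?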

2 answers:

Answer 0 (score: 1)

Thanks for your answers. In the end I had to override equals(Object obj) and hashCode() in my TermNode class.

I followed several Stack Overflow posts and used this approach:

@Override
public boolean equals(Object other) {
    if(this == other) {
        return true;
    }

    if((other == null) || (other.getClass() != this.getClass())) {
        return false;
    }

    TermNode node = (TermNode)other;

    return this.value.equals(node.getValue()) 
            && PlotterUtil.isEqual(this.children, node.getChildren()) 
            && this.type == node.getType();
}

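Since the TermNode class contains a List<TermNode> field, the argument lists have to be checked for equality too, but ignoring the order of the arguments, because the arguments are either a single value such as a number or a variable, or they are contained in another TermNode such as an Operator. A Function should never ignore the order of its arguments. To check two lists for equality, I used the code from here.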

public static boolean isEqual(List<TermNode> one, List<TermNode> two) {
    if(one == null && two == null) {
        return true;
    }

    if((one != null && two == null) || (one == null && two != null)) {
        return false;
    }

    if(one.size() != two.size()) {
        return false;
    }

    one = getSortedList(one);
    two = getSortedList(two);

    return one.equals(two);
}
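
getSortedList is not shown in this answer. As an alternative that needs no ordering on the elements, two lists can be compared as multisets by counting how often each element occurs. The following is only a sketch, assuming the element type has consistent equals and hashCode and the lists contain no null elements; the names ListCompare, counts and sameElements are made up for illustration and are not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ListCompare {

    // Counts how often each element occurs in the list.
    static <T> Map<T, Long> counts(List<T> list) {
        return list.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // True if both lists hold the same elements the same number of times, in any order.
    static <T> boolean sameElements(List<T> one, List<T> two) {
        if (one == null || two == null) {
            return one == two; // equal only if both are null
        }
        return one.size() == two.size() && counts(one).equals(counts(two));
    }

    public static void main(String[] args) {
        System.out.println(sameElements(Arrays.asList("x", "y", "x"), Arrays.asList("y", "x", "x")));
        System.out.println(sameElements(Arrays.asList("x", "y", "y"), Arrays.asList("y", "x", "x")));
    }
}
```

The size check is only a shortcut; comparing the two count maps already decides the result.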

This also solved the problem with

Map<TermNode, Integer> occurrences = 
    children.stream().collect(Collectors
        .groupingBy(term -> term, Collectors.summingInt(term -> 1))
    );

because the Map<TermNode, Integer> built with term -> term checks its keys for equality through the object's equals(Object obj) and hashCode() methods.
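
The hashCode() override itself is not shown above. The sketch below is one way the pieces could fit together, not the code actually used: Node is a hypothetical, stripped-down stand-in for TermNode (a String type instead of TermTypeEnum), and it treats the children of every node as order-insensitive. The important part is that hashCode() stays consistent with equals(), so equal nodes land in the same bucket no matter how their children are ordered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class GroupingDemo {

    // Hypothetical stand-in for TermNode, reduced to the three compared fields.
    static final class Node {
        final String value;
        final String type;
        final List<Node> children;

        Node(String value, String type, Node... children) {
            this.value = value;
            this.type = type;
            this.children = Arrays.asList(children);
        }

        // Children viewed as a multiset: node -> number of occurrences.
        private Map<Node, Integer> childCounts() {
            Map<Node, Integer> counts = new HashMap<>();
            for (Node child : children) {
                counts.merge(child, 1, Integer::sum);
            }
            return counts;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (other == null || other.getClass() != getClass()) return false;
            Node node = (Node) other;
            return value.equals(node.value)
                    && type.equals(node.type)
                    && children.size() == node.children.size()
                    && childCounts().equals(node.childCounts());
        }

        @Override
        public int hashCode() {
            // Summing the child hashes keeps the result independent of child order.
            int childHash = 0;
            for (Node child : children) {
                childHash += child.hashCode();
            }
            return Objects.hash(value, type) + 31 * childHash;
        }
    }

    // Same collector as in the question; it now groups by field values.
    static Map<Node, Integer> countOccurrences(List<Node> children) {
        return children.stream()
                .collect(Collectors.groupingBy(term -> term, Collectors.summingInt(term -> 1)));
    }

    public static void main(String[] args) {
        List<Node> children = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            children.add(new Node("x", "VARIABLE")); // four distinct instances
        }
        Map<Node, Integer> occurrences = countOccurrences(children);
        // prints: 1 entry, count 4
        System.out.println(occurrences.size() + " entry, count "
                + occurrences.get(new Node("x", "VARIABLE")));
    }
}
```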

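Answer 1 (score: 0)

This is too long for a comment, so I am posting it as an answer...

The general idea is that you need to be able to tell whether two TermNode objects are the same. For that you could add a method to your object that flattens it, somehow recursively computing a String from each field, and sorts the result alphanumerically. Since your object is made up of just two fields, value and TermTypeEnum, this should not be too hard to do; it could look something like this (obviously not compiled):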

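// needs java.util.ArrayList, java.util.List and java.util.stream.Collectors
private List<String> flatten(TermNode node) {
    List<String> flattened = new ArrayList<>();
    if (!node.getChildren().isEmpty()) {
        // recurse into the arguments of an operator or function
        for (TermNode child : node.getChildren()) {
            flattened.addAll(flatten(child));
        }
    }
    // probably some checks here
    flattened.add(node.getValue() + node.getType());
    return flattened;
}

// this will give you a sorted String from all the values
public String toCompareBy() {
    return flatten(this).stream().sorted().collect(Collectors.joining(","));
}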

So your code:

Map<TermNode, Integer> occurrences = children.stream()
        .collect(Collectors.groupingBy(
            term -> term, 
            Collectors.summingInt(term -> 1)));

could be changed to something like this:

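 children.stream()
         .collect(Collectors.toMap(
               // same as Function.identity(); a lambda avoids a clash with the question's own Function class
               term -> term,
               // every element starts with a count of 1
               term -> 1,
               (left, right) -> {
                    return left + right;// or left + 1
               },
               () -> new TreeMap<>(Comparator.comparing(TermNode::toCompareBy))
         ))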
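
To make the TreeMap idea concrete, here is a small runnable sketch. Term is a hypothetical leaf-only stand-in for TermNode that deliberately does not override equals/hashCode, and its toCompareBy() simply joins value and type; the real method would have to flatten the children as well. With a TreeMap, the comparator alone decides which keys count as the same, so the counting works without touching equals.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class TreeMapCountDemo {

    // Hypothetical stand-in for TermNode without equals/hashCode overrides.
    static final class Term {
        final String value;
        final String type;

        Term(String value, String type) {
            this.value = value;
            this.type = type;
        }

        // Comparison key; for a leaf this is just value and type.
        String toCompareBy() {
            return value + ":" + type;
        }
    }

    // Counts terms whose comparison keys are equal, using a comparator-ordered TreeMap.
    static Map<Term, Integer> countOccurrences(List<Term> terms) {
        return terms.stream().collect(Collectors.toMap(
                term -> term,
                term -> 1,
                Integer::sum,
                () -> new TreeMap<Term, Integer>(Comparator.comparing(Term::toCompareBy))));
    }

    public static void main(String[] args) {
        List<Term> terms = Arrays.asList(
                new Term("x", "VARIABLE"),
                new Term("y", "VARIABLE"),
                new Term("x", "VARIABLE"),
                new Term("x", "VARIABLE"));
        Map<Term, Integer> occurrences = countOccurrences(terms);
        // Lookups also go through the comparator, so a fresh instance finds the entry.
        System.out.println("x: " + occurrences.get(new Term("x", "VARIABLE")));
        System.out.println("y: " + occurrences.get(new Term("y", "VARIABLE")));
    }
}
```

One trade-off to keep in mind: such a map is ordered "inconsistently with equals", so it should not be mixed with code that expects keys to be compared through equals().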