From query

Time: 2017-11-21 10:27:29

Tags: sql hadoop impala

The Cloudera Impala guide (https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_conversion_functions.html) has an example that uses the typeof() function to check the return data type of a numeric expression:

+--------------------------+
| typeof(5.30001 / 2342.1) |
+--------------------------+
| DECIMAL(13,11)           |
+--------------------------+

When I run this:

select typeof(5.30001),typeof(2342.1),typeof(5.30001 / 2342.1);

it returns:

DECIMAL(6,5)    DECIMAL(5,1)    DECIMAL(13,11)

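The first two are obvious, but I don't understand why the third expression has that data type. Can the return data type be determined from the numeric expression itself? Also, for a division between columns, say decimal(13,5) / decimal(25,4), is there a way to work out what the return data type should be? Thanks.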

1 answer:

Answer 0: (score: 0)

This is indeed a bit obscure. Below is the relevant code that determines the result type of arithmetic on decimal operands.

  /**
   * Returns the result type for (t1 op t2) where t1 and t2 are both DECIMAL, used when
   * DECIMAL version 2 is enabled.
   *
   * These rules are similar to (post Dec 2016) Hive / sql server rules.
   * http://blogs.msdn.com/b/sqlprogrammability/archive/2006/03/29/564110.aspx
   * https://msdn.microsoft.com/en-us/library/ms190476.aspx
   *
   * TODO: implement V2 rules for ADD/SUB.
   *
   * Changes:
   *  - There are slight difference with how precision/scale reduction occurs compared
   *    to SQL server when the desired precision is more than the maximum supported
   *    precision.  But an algorithm of reducing scale to a minimum of 6 is used.
   */
  private static ScalarType getDecimalArithmeticResultTypeV2(Type t1, Type t2,
      ArithmeticExpr.Operator op) throws AnalysisException {
    Preconditions.checkState(t1.isFullySpecifiedDecimal());
    Preconditions.checkState(t2.isFullySpecifiedDecimal());
    ScalarType st1 = (ScalarType) t1;
    ScalarType st2 = (ScalarType) t2;
    int s1 = st1.decimalScale();
    int s2 = st2.decimalScale();
    int p1 = st1.decimalPrecision();
    int p2 = st2.decimalPrecision();
    int resultScale;
    int resultPrecision;

    switch (op) {
      case DIVIDE:
        // Divide result always gets at least MIN_ADJUSTED_SCALE decimal places.
        resultScale = Math.max(ScalarType.MIN_ADJUSTED_SCALE, s1 + p2 + 1);
        resultPrecision = p1 - s1 + s2 + resultScale;
        break;
      case MOD:
        resultScale = Math.max(s1, s2);
        resultPrecision = Math.min(p1 - s1, p2 - s2) + resultScale;
        break;
      case MULTIPLY:
        resultScale = s1 + s2;
        resultPrecision = p1 + p2 + 1;
        break;
      case ADD:
      case SUBTRACT:
      default:
        // Not yet implemented - fall back to V1 rules.
        return getDecimalArithmeticResultTypeV1(t1, t2, op);
    }
    // Use the scale reduction technique when resultPrecision is too large.
    return ScalarType.createAdjustedDecimalType(resultPrecision, resultScale);
  }

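So 5.30001 has precision 6 and scale 5, and 2342.1 has precision 5 and scale 1. That gives p1 = 6, s1 = 5, p2 = 5, s2 = 1. The output scale is s1 + p2 + 1 = 5 + 5 + 1 = 11 (which is above the minimum adjusted scale, so the max() has no effect), and the output precision is p1 - s1 + s2 + scale = 6 - 5 + 1 + 11 = 13, hence DECIMAL(13,11). The rules are explained in more detail at the links in the code comment.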
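
To check the column case from the question, decimal(13,5) / decimal(25,4), the DIVIDE branch can be replayed with plain integers. The following is only a sketch, not Impala's code: the class and method names are made up for illustration, the constants 38 and 6 are assumed values (the quoted snippet names `MIN_ADJUSTED_SCALE` but does not show it, and its comment mentions a minimum scale of 6), and `adjust` is my assumption of what `createAdjustedDecimalType` does when the precision is too large.

```java
// Sketch only: replays the DIVIDE branch quoted above with plain ints.
final class DecimalDivideSketch {
    // Assumed values; the quoted snippet does not show these constants.
    static final int MAX_PRECISION = 38;
    static final int MIN_ADJUSTED_SCALE = 6;

    /** Returns {precision, scale} for DECIMAL(p1,s1) / DECIMAL(p2,s2), before adjustment. */
    static int[] divideRaw(int p1, int s1, int p2, int s2) {
        int scale = Math.max(MIN_ADJUSTED_SCALE, s1 + p2 + 1);
        int precision = p1 - s1 + s2 + scale;
        return new int[] {precision, scale};
    }

    /**
     * Assumed scale reduction: cap the precision at MAX_PRECISION and take the
     * overflow out of the scale, but never go below min(scale, MIN_ADJUSTED_SCALE).
     */
    static int[] adjust(int precision, int scale) {
        if (precision > MAX_PRECISION) {
            int minScale = Math.min(scale, MIN_ADJUSTED_SCALE);
            int delta = precision - MAX_PRECISION;
            precision = MAX_PRECISION;
            scale = Math.max(scale - delta, minScale);
        }
        return new int[] {precision, scale};
    }

    /** Formats the adjusted result type as DECIMAL(p,s). */
    static String divide(int p1, int s1, int p2, int s2) {
        int[] raw = divideRaw(p1, s1, p2, s2);
        int[] adjusted = adjust(raw[0], raw[1]);
        return "DECIMAL(" + adjusted[0] + "," + adjusted[1] + ")";
    }

    public static void main(String[] args) {
        // 5.30001 / 2342.1, i.e. DECIMAL(6,5) / DECIMAL(5,1)
        System.out.println(divide(6, 5, 5, 1));
        // The column case from the question: DECIMAL(13,5) / DECIMAL(25,4)
        System.out.println(divide(13, 5, 25, 4));
    }
}
```

The first call reproduces DECIMAL(13,11). For the second, the unadjusted formula gives scale 5 + 25 + 1 = 31 and precision 13 - 5 + 4 + 31 = 43, which is more than the assumed maximum of 38, so under the assumed reduction the sketch prints DECIMAL(38,26). Running typeof() on the real columns is the way to confirm what your Impala version actually returns.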