Question

Spark currently has two Row implementations:

import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.InternalRow

Why are both needed? Do they represent the same encoded entity, with one used internally (internal API) and the other exposed through the external API?

Answer 1

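Row is the stable, public row implementation. InternalRow, as its name suggests, is meant to be used internally within Spark SQL. I quote the InternalRow documentation below: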

/**
 * An abstract class for row used internally in Spark SQL, which only contains the columns as
 * internal types.
 */
abstract class InternalRow extends SpecializedGetters with Serializable {
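
To make the "internal types" remark concrete, here is a minimal sketch that builds the same two-column row both ways. It assumes the Spark SQL and Catalyst jars are on the classpath; the object name RowVsInternalRow and the sample values are made up for illustration. Since InternalRow is not a stable public API, the calls on it may differ between Spark versions.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical demo object; not part of Spark itself.
object RowVsInternalRow {
  def main(args: Array[String]): Unit = {
    // Public API: column values are ordinary JVM types such as java.lang.String.
    val external: Row = Row("spark", 3)
    val name: String = external.getString(0)
    val count: Int = external.getInt(1)

    // Internal API: a string column is held as UTF8String, not java.lang.String.
    val internal: InternalRow = InternalRow(UTF8String.fromString("spark"), 3)
    val internalName: UTF8String = internal.getUTF8String(0)
    val internalCount: Int = internal.getInt(1)

    println(s"Row: $name, $count")
    println(s"InternalRow: ${internalName.toString}, $internalCount")
  }
}
```

Both rows carry the same logical data; the difference shown here is only in how the string column is represented, which is why user code normally stays with Row.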

Difference between Spark's Row and InternalRow types
